blog

Rails app seed data

date: 2018-03-20

summary: Planning your data model, building your seed file.


Columbo (the great 70s TV detective) would often solve a case with small bits of seemingly unimportant information. The unwitting suspect would admit to key points while Columbo quietly built his case. This post covers some points that may seem unimportant compared to coding your solution, but if done, will "build your case" for a solid dataset and foundation for your development project.

For applications that require a database, there are steps I follow that help me ensure my data is solid before I write any of the application logic or client side views. Like Columbo, this helps me eliminate things that would otherwise clutter my thinking as I develop, debug and add abstraction to my application.

If your project requires a database, take the time to plan it out, considering the table relationships, the column names, meaningful seed data and a reusable script of steps for resetting your data while working locally on your machine, or publishing on a remote server.

Some reasons why I always do this before a project:

I want a bulletproof way to reset my database any time I please, with minimal refactoring or guesswork. This matters most when I deploy my apps to Heroku, Netlify, Firebase, AWS or GitHub Pages: I want to be sure that I have solid code for resetting my data on a remote server. With a thoughtful seed file, I have an easy way to add new fields to my database and new data to match those fields, removing any guesswork about the effectiveness of the new data.

The idea is to have solid relational data prepared before you begin the thornier parts of client-side development. Whether you are building a small app or a full-blown API serving data to a front-end app, these steps will help you to:

  1. Conveniently Drop, Create, Migrate, Seed and Start your database and application
  2. Keep your focus during development
  3. Eliminate some complexities during debugging, because you can rule out data problems
  4. Have a familiar dataset throughout development, so that:
    a. Your views are more meaningful
    b. You will gain insights on ways to enhance your data model, based on your development
    c. You will have a convenient way to add these enhancements, and painlessly bring your results to life

When we are in the throes of debugging and trying to make our code work, a seed dataset can be a great helper. You know exactly what the list of 'cats' or 'books' should be, even when the developer console or debugger does not give you 100% clear answers.

Summary of steps:

  1. Plan Data Model
  2. Choose Naming Convention
  3. Plan Active Record relationships
  4. Write Database Migrations
  5. Use Active Model Serializer gem
  6. Create Seed file and Test
  7. Create Rake tasks to DCMS (Drop, Create, Migrate, Seed / Start) the DB and App

Why?

The better you can shape your tables around actual user needs, the more relevant your applications will be. Ask enough questions to determine which relationships are one-to-many, one-to-one or many-to-many.

Most important, you want to think about the relationships between tables, so that you get the complex relationships you need, while you leave simple things alone.

Solid data models are a combination of:

  • practical thinking about how you refer to the data
  • intuitive naming conventions
  • database relationships

Practical Thinking:

Your early instincts about names often prove correct.
If you find yourself calling a table something other than what you've named it, you should probably consider changing it to the name you instinctively use. Be aware that most languages and frameworks have 'reserved words'. In Rails, for example, you don't want to use 'type' as a column name, because Active Record reserves it for single table inheritance.

Naming Conventions:

Develop your instincts for tight, short names where possible, names that reflect your intuitive understanding of your data. You'll type these names hundreds of times during the development of your project, so wouldn't you rather type "zip" instead of "postal_code"?

Database Relationships:

Talk about your data, using Nouns, Verbs and Adjectives

What is it?
What does it do?
Why does it do it?
When does it do it?
When does it start doing that?
When does it end doing that?
What changes after that?
Where does it exist?
What things does it touch?
How many are there of it?
How is it created, edited, updated, deleted?

Identify NOUNs

  • Student, Teacher, Post, Task

What are the attributes (ADJECTIVES) of that NOUN?
- student.age, student.major
- teacher.tenure_status, teacher.specialty
- post.title, post.content
- task.title, task.date

What are the VERBS describing things you do with that NOUN?
- student: 'attends' a class, 'submits homework', 'asks a question'
- teacher: 'teaches a class', 'grades homework', 'creates lesson plan'
- post: 'is read by a user', 'is liked or disliked', 'is published'
- task: 'is created', 'is overdue', 'is completed'
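
As a rough illustration of how this exercise can carry over into code, here is a minimal plain-Ruby sketch for the Task noun: the adjectives become attributes and the verbs become methods. It is not a Rails model, and it assumes that `task.date` means a due date, which the list above does not specify.

```ruby
require "date"

# NOUN: Task. ADJECTIVES become attributes; VERBS become methods.
class Task
  attr_reader :title, :date

  def initialize(title:, date:)
    @title = title
    @date = date # assumption: the date is the task's due date
    @completed = false
  end

  # VERB: 'is completed'
  def complete!
    @completed = true
  end

  def completed?
    @completed
  end

  # VERB: 'is overdue' (past its date and not yet completed)
  def overdue?(today = Date.today)
    !completed? && date < today
  end
end
```

Talking through the verbs first tends to surface attributes you would otherwise miss: 'is completed' needs somewhere to record completion, which was not in the original list of adjectives.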

List the ways the NOUNs relate to each other:
- Teacher 'has_many' Student(s)
- Teacher 'has_many' classes
- Student has one Teacher for one class
- Student 'has_many' Teachers, through classes
- Post 'belongs_to' Author
- Author 'has_many' Post(s)
- Task 'belongs_to' Student

List specific relationships that will require a JOIN table
- You might have a Class table that 'has_many' Students and 'belongs_to' a single Teacher
- This would allow a single Teacher to have many Students, THROUGH a Class
- The Class table might have these basic fields:
  - id (class_id if referred to in other tables)
  - teacher_id (the ID of a Teacher)
  - student_id (the ID of a Student)
  - subject (the subject of the class, for example)
- We would create a new row in the Class table for every single student_id
- This allows us to query the database for a single Teacher and get all the students for that Teacher, through the rows in the Class table that contain that particular teacher's ID (teacher_id)
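
To see how the join rows answer the question "which students does this teacher have?", here is a minimal plain-Ruby sketch with no database behind it. The names and sample rows are made up for illustration. The join row is called `Enrollment` (a hypothetical name) rather than `Class`, because `Class` is already a core Ruby constant and is best avoided as a model name.

```ruby
# Plain-Ruby stand-ins for table rows; no database or Rails required.
Teacher    = Struct.new(:id, :name)
Student    = Struct.new(:id, :name)
Enrollment = Struct.new(:id, :teacher_id, :student_id, :subject)

TEACHERS = [Teacher.new(1, "Ms. Rivera"), Teacher.new(2, "Mr. Chen")].freeze
STUDENTS = [Student.new(1, "Ana"), Student.new(2, "Ben"), Student.new(3, "Cal")].freeze

# One join row for every student in a class.
ENROLLMENTS = [
  Enrollment.new(1, 1, 1, "algebra"),
  Enrollment.new(2, 1, 2, "algebra"),
  Enrollment.new(3, 2, 3, "biology")
].freeze

# All students for one teacher, found through the join rows
# that carry that teacher's id.
def students_for(teacher, enrollments, students)
  student_ids = enrollments
                .select { |row| row.teacher_id == teacher.id }
                .map(&:student_id)
  students.select { |student| student_ids.include?(student.id) }
end

puts students_for(TEACHERS.first, ENROLLMENTS, STUDENTS).map(&:name).inspect
```

In a Rails app, a `has_many ... through:` association would perform this lookup for you; the sketch only shows what the join rows make possible.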

For Example

In a recent Rails app for a music teaching studio, I wanted to have tables for Teachers, Students, Resources and Lessons. I talked through each one in my mind, determining the following for my context:

Teacher
- has_many :students
- has_many :lessons
- has_many :resources, through: :lessons

Student
- belongs_to :teacher
- has_many :lessons
- has_many :resources, through: :lessons

Lesson
- belongs_to :teacher
- belongs_to :student
- has_many :resources

Resource
- has_many :lessons
- has_many :teachers, through: :lessons
- has_many :students, through: :lessons

Wherever possible, I wanted to exploit Active Record relationships so that my API data would include not only the related IDs but also the full related objects.

Takeaways

Nouns and Verbs

  • think about Nouns and Verbs as you imagine your data set
  • often a single (1) Noun takes action (verb) on (many) of another Noun
  • who has many of what?
  • what belongs to a single who?
  • what are the natural join tables? (a Class joins a Teacher with multiple Students)
  • what join tables do you need to create? You might have to literally join two tables with an obvious name such as project_student:
    • a Teacher might have many project_students
    • ...and would therefore have many students THROUGH project_students
    • this would distinguish these students from the ones the Teacher has via Class, as above

Settle on the types of table relationships you need

  • 1 to 1
  • 1 to many
  • many to many
  • stand alone

CREATE YOUR DATA PLAN

  • Based on your research, plan out your tables, columns and relationships
  • This is one of the most productive things you can do before you start coding
  • A solid plan, documented, provides you a path for inevitable future changes/iterations
  • This forces you to refine your naming conventions

Speaking of naming conventions...

Here are some thoughts about naming tables and columns.

  • I like names that are easy to spell and type, since we type them hundreds of times while we code. We also think through code with these names in our heads, so you want as little interruption in thought (over mundane things you can control) as possible.

  • I like all lower case names if they work within best practices for the context.

    • fname is a little cryptic, a little less readable than firstname
    • firstName is fine, if you never screw up the camel casing
    • firstname is easier because you never have to give a thought to a capital. It is also only 3 letters different from 'lastname', so you can quickly copy it and edit the copy
  • Short, concise names keep your thoughts clear, and lower case reduces typos
  • Your mileage may vary and your boss may differ; don't be stubborn if your context requires something specific

  • Your ability to FOLLOW what is asked of you, exactly, shows an employer that you can be a proper part of the business machinery; that you can be relied upon to do what is needed, without argument or improvisation where improvisation is not wanted or useful to the goal

Summary

We need to think through our data, to choose practical, organic relationships.
We want to avoid too many fields for now, but still create enough data interconnection to make our API rich with relational data.


Migrations and Seed file

We covered the internal thinking you must do before coding your Rails application data model. That internal work will save you a lot of rewriting, if you have asked yourself enough questions about the interactions of your tables and the naming conventions you will type 100s of times during your development process.

Have you ever been on a project where you are constantly using the word "title" in your code, but instead the database requires the word "name"? Or my favorite peeve: firstName vs first_name vs fname vs first-name.

It is important to refine WHAT YOU WANT, before you start coding it.
Pseudocoding the database relationships, table names, column names and even the actual Rails database migrations will:

  1. Make your project flow smoothly from conception to reality
  2. Provide you with convenient paths to follow for changes
  3. Document the database for working with other developers
  4. Force your mind to decide what you need
  5. Reduce the initial 'scope creep' that delays quick iteration
    • 3 or 4 columns in a table for first draft, not more
    • 3 or 4 tables if possible, to get first iteration 'live'
    • breadth before depth, get the body parts alive and working, then get fancy

The Yin and Yang of it

We constantly battle the need to develop something fast, and the need to carefully plan our development.

Yin:

Getting 'something' working quickly helps you get to your 2nd iteration sooner, where key insights await.

Yang:

Coding immediately after conceiving an idea could actually slow you down, as you get lost in a rabbit hole or get stuck trying to make something work that you end up abandoning.

CREATE YOUR SEED FILE

Once you have documented what you need for tables, column names and relationships, use this template for creating a seed file. It will be easy to update with any changes you make to your model. And with the rake task below, you can "flush the database toilet" as many times as you want, any time during the project for clarity while developing.
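
The rake task itself is not reproduced in this section, so here is a minimal sketch of what a DCMS task could look like. The task name `db:dcms` and the file location `lib/tasks/dcms.rake` are assumptions made for illustration; `db:drop`, `db:create`, `db:migrate` and `db:seed` are the standard Rails database tasks. The sketch covers Drop, Create, Migrate and Seed only, so starting the app remains a separate command.

```ruby
# Hypothetical file: lib/tasks/dcms.rake
# Inside a Rails app Rake is already loaded; the require only
# matters when this file is run on its own.
require "rake"

namespace :db do
  desc "Drop, create, migrate and seed the database"
  # The prerequisites run in the order listed.
  task dcms: %w[db:drop db:create db:migrate db:seed]
end
```

With a file like this in place, `rake db:dcms` would run the four steps in sequence. Because it drops the database, a task like this is something to run deliberately, especially against a remote server.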

Ruby seed file template for Rails App

  • The template has 2 parts for each database table.
  • The DATA_table_name variable holds a Hash with a key for the column names and a key for the sample data that matches those columns. You can easily edit the column names and sample data whenever you need to change your database.
  • The other part is the Ruby method you run to create each new row in the database. It parses the DATA_table_name variable, creating each row.
  • Finally, the seed file will have a master method that calls each of the "make_Data" methods, and a call to that master method at the bottom, so that it runs in response to your "rake db:seed" command.

def main
  make_diets
  make_groups
  make_foods
  make_users
end

main
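
To make the two-part template concrete, here is a minimal sketch for a single table. It is an illustration under assumptions rather than the full seed file: the `:columns` and `:rows` key names, the `name` and `description` columns and the sample rows are all hypothetical, and a tiny stand-in `Diet` class replaces the Active Record model so the sketch runs outside Rails.

```ruby
# Stand-in for the Active Record model so this sketch runs on its own.
# In a real app, delete this class and let app/models/diet.rb provide Diet.
class Diet
  ROWS = [] # plays the role of the diets table

  def self.create!(attributes)
    ROWS << attributes
    attributes
  end
end

# Part 1: the column names, plus sample rows whose values are
# listed in the same order as the columns.
DATA_diets = {
  columns: %w[name description],
  rows: [
    ["vegan", "no animal products"],
    ["keto", "low carb, high fat"]
  ]
}.freeze

# Part 2: pair each row's values with the column names
# and create one record per row.
def make_diets
  DATA_diets[:rows].each do |values|
    Diet.create!(DATA_diets[:columns].zip(values).to_h)
  end
end

# In the full seed file, main would call this alongside the other make_ methods.
make_diets
```

Adding a column then means adding one name to `:columns` and one value to each row, which keeps the sample data and the schema easy to change together.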

Full example of seed file


Happy coding!