blog
date: 2018-03-20
summary: Planning your data model, building your seed file.
Columbo (the great 70s TV detective) would often solve a case with small bits of seemingly unimportant information. The unwitting suspect would admit to key points while Columbo quietly built his case. This post covers some points that may seem unimportant compared to coding your solution, but if done, will "build your case" for a solid dataset and foundation for your development project.
For applications that require a database, there are steps I follow that help me ensure my data is solid before I write any of the application logic or client side views. Like Columbo, this helps me eliminate things that would otherwise clutter my thinking as I develop, debug and add abstraction to my application.
If your project requires a database, take the time to plan it out, considering the table relationships, the column names, meaningful seed data and a reusable script of steps for resetting your data while working locally on your machine, or publishing on a remote server.
I want to have a bullet proof way to reset my database anytime I please, with minimal refactoring or wondering, especially when I end up deploy my apps to Heroku, Netlify, Firebase, AWS or Github pages, I want to be sure that I have solid code for resetting my data on a remote server. With a thoughtful seed file, I have an easy way to add both new fields to my database and add new data to match those fields, removing any guesswork about the effectiveness of the new data.
The idea is to have solid relational data prepared before you begin the thornier parts of client side development. If you are building anything from a small app, to a full blown API serving up data to a front end app, these steps will help you to:
When we are in the throes of debugging and trying to make our code work, a great helper can be your seed dataset. You know exactly what the list of 'cats' or 'books' should be and sometimes you don't get 100% clear answers from the developer console or debugger(s).
The better you can shape your tables around actual user needs, the more relevant your applications will be. Ask enough questions to determine relationships of one-to-many, one-to-one, many-to-many
Most important, you want to think about the relationships between tables, so that you get the complex relationships you need, while you leave simple things alone.
Solid data models are a combination of:
Your early instincts about names often prove correct.
If you find yourself calling a table something other than what you've named it, you probably should consider changing to your instinctive names. Be aware that there are 'reserved words' in most languages. You don't want to use 'type' for a table name in Rails for example.
You want to develop your instincts for tight, short names if possible, that reflect the intuitive understanding of your data. You'll type these names hundreds of times in the development of your project, so wouldn't you rather type "zip" instead "postal_code"?
What is it?
What does it do?
Why does it do it?
When does it do it?
When does it start doing that?
When does it end doing that?
What changes after that?
Where does it exist?
What things does it touch ?
How many are there of it?
How is it created, edited, updated, deleted?
What are the attributes (ADJECTIVES) of that NOUN?
- student.age, student.major
- teacher.tenure_status, teacher.specialty
- post.title, post.content
- task.title, task.date
What are the VERBS describing things you do with that NOUN?
- student: 'attends' a class, 'submits homework', 'asks a question'
- teacher 'teaches a class', 'grades homework', 'creates lesson plan'
- post: 'is read by a user', 'is liked or disliked', 'is published'
- task: 'is created', 'is overdue', 'is completed'
List the ways the NOUNs relate to each other:
- Teacher 'has_many' Student(s)
- Teacher 'has_many' classes
- Student has one Teacher for one class
- Student 'has_many' Teachers, through classes
- Post 'belongs_to' Author
- Author 'has_many' Post(s)
- Task 'belongs_to' Student
List specific relationships that will require a JOIN table
- You might have a Class table, that 'has_many' Students, and 'belongs_to' a single Teacher
- This would allow a singe Teacher to 'has_many' Students, THROUGH a Class
- The Class table might have these basic fields:
- id (class_id if referred to in other tables)
- teacher_id (the ID of a Teacher)
- student_id (the ID of a Student)
- subject (the subject of the class for example)
In a recent Rails app for a music teaching studio, I wanted to have tables for Teachers, Students, Resources and Lessons. I talked through each one in my mind, determining the following for my context:
Teacher
- has_many :students
- has_many :lessons
- has_many :resources, through: :lessons
Student
belongs_to :teacher
- has_many :lessons
- has_many :resources, through: :lessons
Lesson
belongs_to :teacher
belongs_to :student
- has_many :resources
Resource
- has_many :lessons
- has_many :teachers, through: :lessons
- has_many :students, through: :lessons
Wherever possible, I wanted to exploit Active Record relationships so that my API data would have not only the related ID's but full related objects.
## Speaking of naming conventions...
I like easy names to spell and type, since we're typing them hundreds of times while we code. And we're thinking through code with these names in our head, so you want as little interuption in thought (over mundane things you can control) as possible.
I like all lower case names if they work within best practices for the context.
Short, concise names keep your thoughts clear, and lower case reduces typos
your mileage may vary and your boss may differ; don't be stubborn if your context requires something specific
Your ability to FOLLOW what is asked of you, exactly, shows an employer that you can be a proper part of the business machinery; that you can be relied upon to do what is needed, without argument or improvisation where improvisation is not wanted or useful to the goal
We need to think through our data, to choose practical, organic relationships.
We want to avoid too many fields for now, but still create enough data interconnection to make our API rich with relational data.
We covered the internal thinking you must do before coding your Rails application data model. That internal work will save you a lot of rewriting, if you have asked yourself enough questions about the interactions of your tables and the naming conventions you will type 100s of times during your development process.
Have you ever been on a project where you are constantly using the word "title" in your code, but instead the database requires the word "name"? Or my favorite peeve: firstName vs first_name vs fname vs first-name.
It is important to refine WHAT YOU WANT, before you start coding it.
Pseudo coding the database relationships, table names, column names and even the actual Rails database migrations, will:
We constantly battle the need to develop something fast, and the need to carefully plan our development.
Getting 'something' working quickly, helps you get to your 2nd iteration sooner, where key insights await.
Coding immediately after conceiving an idea, could actually slow you down as you get lost in a rabbit hole, or stuck trying to make something work, which you end up abandoning.
Once you have documented what you need for tables, column names and relationships, use this template for creating a seed file. It will be easy to update with any changes you make to your model. And with the rake task below, you can "flush the database toilet" as many times as you want, any time during the project for clarity while developing.
def main
make_diets
make_groups
make_foods
make_users
end
main