Friday, July 8, 2016

Introduction: Putting the Right Code In the Right Place

For the past few days we have been introducing new concepts at a high level.  Today we are going to finish introducing them so that we can start fleshing them out on Monday.  If you have any comments, even if you disagree with my assertions, leave them below.  If you want to email us click here.  On to today's topic: Putting the Right Code in the Right Place.


Putting the Right Code in the Right Place.


Databases are extremely good at managing data.  What they aren’t good at is giving you an easy way to ask for what you want and then showing it to you.  This is why we have procedural languages like Java, VB (Visual Basic) and C# (pronounced SEE SHARP).  To muddy the waters further, the procedural languages are also good at defining a user interface, the part the user sees and types his or her data into.  Finally, stateless is the flavor of programming that pertains to the internet.  Stateless means that the server keeps nothing in memory between calls, so anything that needs to be persisted between calls has to be persisted somewhere other than memory.  This generally means a database or a cookie.  While we will touch on statefulness and persistence, they are a little beyond the scope of this article.  What isn’t beyond the scope of this article is how these layers should be designed and how they should interact.


We will discuss the various layers that we can divide our software into, and why, later in the book, but for now we will just jump in and start swimming.  The bottom or lowest layer is the data layer.  This is the table structure or schema of the database.  It holds all the attributes or fields of each entity or table in the database.  Well, a table is not necessarily an entity, but I haven’t written that book yet.  The very top layer contains the Graphical User Interface (GUI).  The GUI is everything the user sees.  The only real logic that goes into the GUI is data validation.  There will be more on data validation later, but the reason we do it here is to avoid a round trip to the database, which would spend database server cycles and network resources just to make sure the user doesn’t have fumble fingers.  Everything else is termed the middle tier(s) and may include a data access layer, a data abstraction layer, a business logic layer and anything else that a savvy developer might think up.  Click here for a discussion of n-tier architecture that you will need to understand to grasp the concepts presented today.

The Data Layer


While this may seem to be the easiest of the layers to design, it is probably the most important.  Messrs. Boyce, Codd and Kimball all made their names coming up with rules defining the best ways to do it.  Sometimes they are right.  The thing to do here is to make it as easy on yourself as possible.  Some developers prefer to really separate out the database entities and use joins in stored procedures and views to pull everything back together; others prefer a less normalized approach to avoid some of the complication the higher normal forms entail.  For an overview of the Boyce-Codd forms of normalization see this.

The Middle Tier


We are going to show, eventually, that the quickest and easiest way to do things is usually also the best.  What is called for here is a very simple data access layer that uses only stored procedures for data operations.  Before you junior guys start yelling, yes, you can do unstructured searches in a stored procedure.  For you procedural purists, no, there is no logic in the data layer.  Stored procedures exist on the database server, but conceptually they are in the middle tier.  They also give you several huge advantages.

1.       Stored query execution plan
The code in a stored procedure is optimized for the machine it runs on and the database structure it supports, making it one of the fastest ways to access data.  Yes, I know that execution plans are stored for ad hoc queries as well, but the ad hoc SQL lives in the application, so you have to go find it there and then republish the application if you need to change it.  Smart developers don’t use ad hoc queries, ever.

2.       Ancillary code
A smart developer can add code for auditing, change tracking and security right in the stored procedure and nobody really ever has to know it is there; it just works.  If you do need to modify it, however, you know where to find it instead of looking for it for half an hour and then realizing it is in a trigger.

3.       Security
Granting the user access to only the stored procedures lets the administrator deny access to the table structure.  That means that users can’t get to data they aren’t supposed to see, payroll for example, and that the structure can change at the table level without breaking existing code and requiring a recompilation and deployment.  Further, the malicious hacker can’t bypass the security in the stored procedures and can't see any data without being authenticated.
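
As a rough illustration of a data access layer that accepts nothing but named procedures, here is a minimal Python sketch.  Everything in it is an assumption made for the example: the `DataAccess` class, the `car` table and the procedure names are hypothetical, and because SQLite has no stored procedures, named parameterized statements kept in one place stand in for them.  On a server database the same layer would call the real stored procedures instead.

```python
import sqlite3

# Hypothetical stand-ins for stored procedures: SQLite has none, so each
# "procedure" here is a named, parameterized statement kept in one place.
PROCEDURES = {
    "car_insert": "INSERT INTO car (make) VALUES (:make)",
    "car_get": "SELECT car_id, make FROM car WHERE car_id = :car_id",
}


class DataAccess:
    """Data access layer that refuses anything except a named procedure."""

    def __init__(self, connection):
        self._connection = connection

    def execute(self, procedure_name, **params):
        # Ad hoc SQL never gets through: only known procedure names run.
        if procedure_name not in PROCEDURES:
            raise ValueError(f"unknown procedure: {procedure_name}")
        cursor = self._connection.execute(PROCEDURES[procedure_name], params)
        rows = cursor.fetchall()
        self._connection.commit()
        return rows


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE car (car_id INTEGER PRIMARY KEY, make TEXT)")

dal = DataAccess(connection)
dal.execute("car_insert", make="Ford")
rows = dal.execute("car_get", car_id=1)
```

Because callers can only name a procedure, the SQL can be changed in one place without touching the code that calls it.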

The middle tier should be as simple as possible, with as little code as necessary to get the job done.  Generally this consists of a way to pass information back to the database easily.  We are going to borrow a term here: primogeniture.  In feudal times this meant the right of the firstborn to inherit the estate of the parents.  In our case, similarly, we are going to state that a child can’t exist without its parent.  Much like the data layer’s entities, the middle tier objects are going to have relationships.  If we have a car object, it may have a tires collection.  This tires collection can only be updated in the database when it is added to an existing car object.  This thin veneer over the data layer is what makes this particular sauce special.

Integrated data access is essential here.  In the above car/tires example, the tires collection has an “Add” method that takes a tire object as a parameter.  This “Add” method only adds the tire to the collection.  The Tires collection object has a constructor that takes a CarID as a parameter and returns the collection of tires on that car.  More on this later.
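
The car/tires relationship described above might be sketched as follows in Python.  This is only one possible reading, not the design the later chapters will present: the `FakeTireStore` class is a hypothetical in-memory stand-in for the database, and the `save` method on the car is an assumption about how the tires eventually reach the database.

```python
from dataclasses import dataclass


@dataclass
class Tire:
    position: str


class FakeTireStore:
    """Hypothetical in-memory stand-in for the tire stored procedures."""

    def __init__(self):
        self._positions_by_car = {}

    def load(self, car_id):
        return [Tire(p) for p in self._positions_by_car.get(car_id, [])]

    def save(self, car_id, tires):
        self._positions_by_car[car_id] = [tire.position for tire in tires]


class Tires:
    """Collection of the tires on one car, loaded by CarID."""

    def __init__(self, store, car_id):
        self._items = store.load(car_id)

    def add(self, tire):
        # Only touches the in-memory collection, never the database.
        self._items.append(tire)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


class Car:
    """Parent object; its tires reach the database only through it."""

    def __init__(self, store, car_id):
        self._store = store
        self.car_id = car_id
        self.tires = Tires(store, car_id)

    def save(self):
        # Assumed persistence path: the child collection is saved by its parent.
        self._store.save(self.car_id, self.tires)


store = FakeTireStore()
car = Car(store, car_id=1)
car.tires.add(Tire("front left"))
before_save = len(Tires(store, 1))  # nothing persisted yet
car.save()
after_save = len(Tires(store, 1))   # the tire arrived with its car
```

The point of the sketch is that `add` never writes anything by itself; a tire only becomes a row once the car it belongs to is saved.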

The Graphical User Interface (GUI)


Just as there should be no logic in the data tier, there should be no logic in the GUI tier.  Just as stored procedures exist on the database server but conceptually sit in the middle tier, the data validation we discussed above lives in the GUI but isn’t logic.  Data validation is just extinguishing the fuse on a bomb that potentially could blow up in our collective face.  It belongs here because the GUI is specific to the application and isn’t going to change even if the database does.  Data validation falls into three categories:

1.       Number Validation
Everything the users type in is a string.  We make sure that if we need a number, the stuff they type in can be converted to a number.

2.       Text Validation
A phone number (for example) is text.  You can’t perform math on it and come up with some meaningful response.  It can be validated by checking to see if all the input is in numerical format and is long enough.

3.       Range Checking
If you are designing a medical application you can be pretty sure you aren’t going to have people over ten feet tall.  Yes, I have run into this DOZENS of times. 
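
The three categories can be sketched as three small Python functions.  The function names, the ten-digit minimum for a phone number and the decision to ignore dashes and spaces are assumptions made for this example, not rules from the article.

```python
import math


def parse_number(text):
    """Number validation: return a float, or None if text is not a number."""
    try:
        value = float(text.strip())
    except ValueError:
        return None
    # float() also accepts "nan" and "inf", which are useless as user input.
    return value if math.isfinite(value) else None


def is_phone_number(text, min_digits=10):
    """Text validation: digits only (after dropping separators), long enough."""
    digits = text.replace("-", "").replace(" ", "")
    return digits.isascii() and digits.isdigit() and len(digits) >= min_digits


def in_range(value, low, high):
    """Range checking: True when low <= value <= high."""
    return low <= value <= high


# A height typed into a medical form, in feet: first a number, then a sane one.
height_feet = parse_number("5.9")
height_ok = height_feet is not None and in_range(height_feet, 0, 10)
```

All three checks run on what the user typed before anything is sent to the database, which is what saves the round trip.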

The best way, of course, to do data validation is not to have to do it at all.  Any time you can spare the user from typing, do so.  The drop down list box/list box is your friend.  Making the contents of one list box depend on the selection in a previous list box (cascading) is ideal.  If you are describing a university and you have a list of classes you want to offer, then when you pick a class, the instructors list should consist of only those who are qualified to teach that class.
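
The cascading idea in the university example could look something like this in Python.  The class names, instructor names and function names are made up for the illustration, and a real application would read the lists from the database rather than from a dictionary.

```python
# Hypothetical data: which instructors are qualified to teach which class.
QUALIFIED_INSTRUCTORS = {
    "Calculus I": ["Adams", "Baker"],
    "Databases": ["Baker", "Chen"],
}


def class_choices():
    """Contents of the first list box: every class on offer."""
    return sorted(QUALIFIED_INSTRUCTORS)


def instructor_choices(selected_class):
    """Contents of the second list box, driven by the first selection."""
    return list(QUALIFIED_INSTRUCTORS.get(selected_class, []))


classes = class_choices()
instructors = instructor_choices("Databases")
```

Since the user can only pick from what the second list offers, there is nothing left to validate.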

Anytime there is a list of things that is more than five to seven items long, do a search.  People can only see about six things at a time and most of us, me included, don't even see that; we see two sets of three.  If you are looking for people, search by name or phone number or date of birth or any other criteria, but do a search.  As we have shown, databases are really good at filtering data.  Use that capability to make your applications fast, easy to use and efficient.

When you do a search, generate a list of things to choose from.  While this sounds like simple common sense, it really isn’t.  If I am searching for students, what I am really looking for are the student’s details, like address and telephone number.  The list of found students generated from the search should be little more than links to the student’s details. 
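
Here is a small Python sketch of a search that returns only a short list of links, with the details fetched separately once the user picks one.  The `student` table, its columns and the sample rows are invented for the example.  SQLite has no stored procedures, so the two functions below stand in for what would be a search procedure and a details procedure on a server database.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, name TEXT, address TEXT, phone TEXT)"
)
connection.executemany(
    "INSERT INTO student (name, address, phone) VALUES (?, ?, ?)",
    [
        ("Ann Smith", "1 Oak St", "555-0101"),
        ("Carl Jones", "5 Pine St", "555-0102"),
        ("Bob Smithers", "9 Elm St", "555-0103"),
    ],
)


def search_students(connection, name_part):
    """Let the database do the filtering; return only (id, name) links."""
    return connection.execute(
        "SELECT student_id, name FROM student WHERE name LIKE ? ORDER BY name",
        (f"%{name_part}%",),
    ).fetchall()


def student_details(connection, student_id):
    """Fetch the full record only for the student the user picked."""
    return connection.execute(
        "SELECT name, address, phone FROM student WHERE student_id = ?",
        (student_id,),
    ).fetchone()


matches = search_students(connection, "smith")
picked_id = matches[1][0]
details = student_details(connection, picked_id)
```

The search result carries just enough to let the user recognize and pick a student; the address and telephone number travel only for the one record that is actually opened.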

The Conclusion of the Introduction


Designing and developing software is an art that the world has NOT mastered.  To make it happen there must be one overall unified design with as much automation and as many standards as possible.  Simplicity is the key factor here.  Second only to simplicity is flexibility.  Simplicity will allow you to actually write the software in some reasonable amount of time and teach others how to do the same, and flexibility will allow you to adapt the same solution to any problem.  In the following weeks and chapters, we are going to demonstrate the easiest way to accomplish these two objectives.

Again, comment or email your questions.  We look forward to hearing your thoughts.
