
Posts Tagged ‘Software development methodology’

I have been using the Netbeans IDE for many years and it is my IDE of choice. I can’t explain why Netbeans and not, e.g., Eclipse or IntelliJ. It does not mean I consider other IDEs inferior – not at all. For example, my colleagues and I chose Eclipse RCP as the platform for one of my more significant projects – Sabre Red Workspace. I’m just personally more comfortable with Netbeans and I see it getting better and better over the years.

So recently I learned two pieces of good news:

    1. Oracle decided to continue evolving Netbeans (which was previously owned by Sun). This was not obvious, since Oracle has its own IDE (JDeveloper) and might have decided to terminate Netbeans. I’m glad that is not the case (thanks Oracle!).
    2. Netbeans finally adopted OSGi as its runtime module container. This is good, since it had its own solution for this and the team kept evolving it for years while most others converged on OSGi. This looked suboptimal, to say the least. Using OSGi also potentially opens the possibility of using Eclipse plugins in Netbeans.

It seems that after the Oracle acquisition the future looks good for Netbeans.

Slides are here.

Read Full Post »

Comparison of software tools (languages, IDEs etc) is one of the favorite topics on various forums. Sometimes it takes the form of a heated debate akin to those between Catholics and Protestants during the religious wars of the 16th and 17th centuries (e.g. see this post and the comments on it). This is no surprise, given the emotional attachment many developers have to their favorite tools.

I have never seen much sense in most of such discussions, though.

Consider this example: what tool is better: screwdriver or hammer?

Depends on what you want to do, right? OK, let’s suppose you need to hammer in a nail, so the hammer is more appropriate. Or maybe a nail gun? The choice between the hammer and the nail gun will depend on which one makes sense from an economic standpoint (if you need to hammer in a single nail once a year, buying a hammer is better justified; if you need to hammer in 100K nails, a nail gun sounds like a better choice despite its higher price).

It also depends on the availability of people able to operate the tool: what is the use of a nail gun if you cannot find anybody able to operate it? Maybe a hammer would be a better choice in such a case?

What tool is better: a 1 pound hammer or a 2 pound hammer? It depends on the individual who will be using it: in general the 2 pound hammer is more efficient, but it may be too heavy for the given individual to handle; then the 1 pound one is better.

This tells us that advantages/disadvantages of any tool must be evaluated only within a context which includes:

  • Task which we want to accomplish with the tool
  • Availability of people able to use the tool
  • Economics of the process
  • Capabilities of the individual or the team who are going to use the tool

There are different software development tasks, different skill levels of developers, different economics of the process, different skills available… That’s why comparing software tools (languages, IDEs etc) outside of a concrete context does not make sense to me.

Unfortunately this is exactly what happens in many discussions about software tools. People come from different contexts and they start comparing tools without defining the context explicitly. And it results in somebody absolutely convinced that the tool ABC is the best thing invented since sliced bread and her/his opponent absolutely convinced that both the tool and the former individual are evil and totally wrong :)

Read Full Post »

We have discussed  craft software development in several previous posts. While I personally like software craft production methods, as a software architect I want to understand how to make software development more efficient in mass production environment.

Software mass production has several distinctive characteristics:

  1. A developer working in such an environment faces large amounts of code she/he is not familiar with; often it is legacy code written years or decades ago by somebody who is not available anymore. There is not much time to learn the code; the developer must become productive on that code very soon.
  2. One can’t assume that a developer working on the code will have skills level higher than average; having a developer with low skills must not be a risk.
  3. Cost/time to market considerations are of paramount importance.

The first point means that developers often are not familiar with the code they are supposed to work with and have very little time to learn it. This is a big problem, but there are ways to alleviate it.

The code should be broken into relatively small modules with very clearly and explicitly defined boundaries (even at the level of the code repository) and contracts between them. My observation is that the time necessary to learn unfamiliar code grows at least as the square of the code size. Breaking the code into well encapsulated modules helps to reduce the amount of code that has to be learned.
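To make the square-law observation concrete, here is a minimal Python sketch. It takes the quadratic growth as the lower bound stated above; the equally sized modules, the function names and the 100K-line code base are illustrative assumptions, and the cost of learning the contracts between modules is ignored.

```python
def learning_cost(lines_of_code: int) -> int:
    """Relative effort to learn one code unit, assuming cost ~ size squared."""
    return lines_of_code ** 2


def modular_learning_cost(total_lines: int, module_count: int,
                          modules_to_learn: int = 1) -> int:
    """Effort to learn only the modules a task touches.

    Assumes (hypothetically) that all modules have the same size and that
    each module can be learned on its own thanks to explicit contracts.
    """
    module_size = total_lines // module_count
    return modules_to_learn * learning_cost(module_size)


# A 100K-line code base learned as one piece vs. split into 10 modules.
monolith = learning_cost(100_000)
one_module = modular_learning_cost(100_000, module_count=10)
all_modules = modular_learning_cost(100_000, module_count=10, modules_to_learn=10)

print("one module is cheaper by a factor of", monolith // one_module)
print("all ten modules are cheaper by a factor of", monolith // all_modules)
```

Under these assumptions even a developer who eventually has to learn every module comes out ahead, because ten small squares add up to less than one big square.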

A better design also helps to understand the code (see here) as well as other basic techniques.

All necessary elements of the code must be explicitly described by the code. You think that is always the case, don’t you? Not at all. Think of dynamic languages: for those languages, type definitions that are important parts of the code (code metadata) exist only in developers’ heads and are never expressed or documented explicitly. Or think of the implicit agreements that are used in Perl. You can find many more examples if you look around. While such things may not be a big problem when a developer knows the code she/he works with well, they are a big problem in software mass production and as such must be eliminated.
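The point about implicit type information can be illustrated with a small Python sketch. The `Order` and `OrderItem` names and both functions are hypothetical, made up for this illustration only; the first function keeps the data shape in the author’s head, the second writes it down in the code.

```python
from dataclasses import dataclass


def total_price_implicit(order):
    # What is `order`? A dict? Which keys? Only the original author knows;
    # a newcomer has to read every caller to find out.
    return sum(item["price"] * item["qty"] for item in order["items"])


@dataclass(frozen=True)
class OrderItem:
    price: float
    qty: int


@dataclass(frozen=True)
class Order:
    items: list[OrderItem]


def total_price_explicit(order: Order) -> float:
    # The shape of the data is now part of the code itself and can be
    # checked by tools such as a static type checker or an IDE.
    return sum(item.price * item.qty for item in order.items)


print(total_price_implicit({"items": [{"price": 2.5, "qty": 4}]}))
print(total_price_explicit(Order(items=[OrderItem(price=2.5, qty=4)])))
```

Both functions compute the same thing; the difference is only in how much a developer unfamiliar with the code has to reconstruct from context.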

The second point means that the code base and the tools must be such that even a beginner developer will produce code of good quality.

While having modular and explicit code definitely helps, I think this is more about proper tools. Think, for example, about the differences in how strings are defined in C++ and Java. While C++ strings have caused countless problems, the way strings are defined in Java completely eliminated that type of issue.

Basically, for software mass production we have to deliberately choose tools that do not allow common problems to happen, or that at least warn about the possibility of them happening. We had better help the developer to avoid problems than rely on her/him to fix them.

This is not just about software languages. Think, for example, about differences between Maven and Ant. Ant is a sort of free-form tool while Maven is much stricter and prescribes certain ways of doing things. Maven is obviously better suited to software mass production.

The third point is of paramount importance for software mass production while it is of little to no importance to open source and lesser importance for software craft production.

Let’s suppose your company has a problem it wants to solve by developing certain software. There are two options: a technically excellent solution (architecture/technology/design etc) and just satisfactory one. Which one would you choose? The first one? Not so fast – you haven’t yet considered all important inputs for the decisions. For software mass production we have to consider cost and time (this is business after all).

Let’s suppose the first solution will cost $10M and it will take 5 years to implement while the second one will take 9 months and will cost $0.9M. Now advantages of the first solution are not so obvious, right?

Let’s add one more input… Your company has just $1M budget for developing the solution and if it is not available within a year, your company will be out of business. Given this, which solution is better?
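The decision above can be written down as a tiny Python sketch: first filter the options by the hard business constraints, and only then compare technical merit. The costs, durations, budget and deadline come from the example; the `Option` class and the technical scores are hypothetical values added purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Option:
    name: str
    cost_musd: float       # cost in millions of dollars
    months: int            # time to implement
    technical_score: int   # hypothetical 1-10 rating of technical merit


def best_option(options: list[Option], budget_musd: float,
                deadline_months: int) -> Optional[Option]:
    """Pick the technically best option among those the business can afford."""
    feasible = [o for o in options
                if o.cost_musd <= budget_musd and o.months <= deadline_months]
    return max(feasible, key=lambda o: o.technical_score) if feasible else None


options = [
    Option("excellent", cost_musd=10.0, months=60, technical_score=9),
    Option("satisfactory", cost_musd=0.9, months=9, technical_score=6),
]

choice = best_option(options, budget_musd=1.0, deadline_months=12)
print(choice.name if choice else "no feasible option")
```

With a $1M budget and a one-year deadline only the satisfactory solution survives the filter, so its lower technical score never enters the comparison.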

Cost/time considerations are important for all choices we make in software mass production, not just the big ones. They are exactly the reasons why developers working in software mass production environments should, for example, use modern IDEs as opposed to just vim or another text editor, and object-oriented languages as opposed to procedural ones (the procedural ones are 5 times more expensive to develop – see here), etc, etc.

Probably there are more issues that should be considered if we want to have efficient software mass production. But those three look to me like the critical ones. Disregard of any of them will likely cause many problems.

Read Full Post »

As we discussed in earlier posts from this series (the first one is here, the previous one is here), while it is possible and practical having small software development teams doing software craft production, it is impossible to have a large software craftsmen team existing for a prolonged period of time. Well, maybe not impossible, but this would be a rare occasion.

By all means, the IT industry as a whole nowadays has no choice but to produce software while employing average developers. Since developers with average job market skills and expertise are not capable of software craftsmanship, some other ways of producing software of satisfactory quality have to be used.

This situation is not unique. Many industries have faced such a problem in the past. Cars, airplanes, radio sets, toothbrushes etc, etc were craft produced initially. However, at some point demand outgrew the capacity of craft production and mass production methods were employed. Mass production in a broad sense can be defined as:

Mass production is the name given to the method of producing goods in large quantities at low cost per unit. But mass production, although allowing lower prices, does not have to mean low-quality production. Instead, mass-produced goods are standardized by means of precision-manufactured, interchangeable parts. The mass production process itself is characterized by mechanization to achieve high volume, elaborate organization of materials flow through various stages of manufacturing, careful supervision of quality standards, and minute division of labor.

There is an important difference between mass production in general and mass production of software. Normally mass production results in large quantities of near identical products. Mass production of software results in a large number of substantially different products (I’m not talking here about printing identical installation CDs, I mean writing code). To be precise, I should have used the term job production for this type of mass production; but in my opinion it obscures the meaning and I prefer to use the less precise but more explicit term mass production.

The IT industry started facing the necessity of software mass production in the 1960s. This was the time when new, more affordable computers were introduced (e.g. the IBM System/360 and DEC PDP-8 models). Until then computers were rare and software was relatively simple, so there were enough folks capable of software craftsmanship to fulfill the demand. But at that point computers became affordable for various businesses and demand for new software development exploded.

Attempts to apply old craft production methods to software mass production resulted in many unpleasant surprises and failed projects. One of the early analyses of the problem can be found in the classic book The Mythical Man-Month by Fred Brooks. Mr Brooks wrote this book based on his experience as a development manager in charge of OS/360 – the operating system for IBM System/360 computers.

Many different ideas and “panaceas” were introduced over time, many of them all but forgotten (although many have become a part of usual practice). For example, who now remembers the hot debates around structured programming (1960s) and the almost religious wars that were waged over the use of GOTO statements? Notably, those ideas were first introduced in the same time period (1960s) in which the IT industry encountered the challenge of software mass production.

Still, I think that the situation didn’t improve much until at least the 1995-2000 time period. It is very indicative that Tom Love and Richard Wiener wrote in their book Object Lessons, written at around 1995: “The successes come from craftsmen, the failures from engineers!” (p.20), which I interpret as saying that software craftsmen deliver while software mass production fails. My observations support this and I believe that attempts at software mass production failed rather often until approximately 1995-2000.

One may say that big software products/systems were successfully developed before 1995-2000. True. The point is that failures happened frequently, more frequently than is acceptable (see Object Lessons for examples).

So what changes happened at around 1995-2000? We’ll discuss this in the next post.

Read Full Post »

We have discussed craft production of software in this post and this one. As I mentioned, it has many advantages, but it has one big problem: it can’t scale beyond certain limits, either at the individual project level or at the industry level. There is simply no way the IT industry can employ exclusively expert developers capable of doing software craft production. It must use developers who can’t be software craftsmen.

This may look like an obvious point, but I found that many folks in the IT industry have a hard time agreeing with it. Let’s discuss this in more detail.

Let’s assume your company is planning to develop a software product and you are thinking about how to staff the development team. In one of my previous posts I mentioned that the industry average is 15K lines of source code per member of the development team per year. One can say that if we calculate throughput just per developer, it will be higher. True, but not completely. Other members of a development team perform certain functions that developers would otherwise have to perform. If there are no business analysts and testers on a project, developers must perform business analysis and testing, right? So even if we compose the entire team just of developers, it is unlikely that the average throughput per team member will be much higher.

Another popular argument is that the best developers have much higher throughput than average. This is also true. But how much higher? There are opinions (Tom Love) that the best developers are up to 25 times more productive. I’m a bit skeptical that such a big difference exists in reality, but there is indeed a difference. I had a case where I could measure the difference with certain statistical reliability. I observed a team of 7 developers who did just bugfixing and nothing else for 2 or 3 months. One developer in this group fixed 6 times more bugs during that period than each of the remaining 6 developers did on average.

Still, let’s be generous and assume that best developers can steadily outperform industry average by 10 times.

This means that if you plan to complete development of a product having 3 million lines of code in two years, you will need 10 outstanding developers or 100 average ones. You bet you would have a hard time hiring 10 outstanding developers. But what if your company is planning to develop 5 such products in parallel (which is not unusual for big modern companies)? The possibility of simultaneously hiring 50 outstanding developers is really very remote.

One may argue that nowadays most software development is about maintaining and evolving  existing software products. True. However there are limits to amount of code which an average developer can efficiently maintain. Per Tom Love it is up to 50K lines of source code per developer.  Again let’s assume that best developers are capable of maintaining 10 times more (500K). Well, how many developers would we need to maintain 20M lines of code? 40 exceptional ones or 400 regular developers.
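The staffing arithmetic in the two paragraphs above can be checked with a few lines of Python. The constants are the figures quoted in the text (15K new lines per team member per year, 50K maintained lines per developer, and the deliberately generous 10x multiplier for the best developers); the function names are made up for this sketch.

```python
import math

AVG_NEW_LOC_PER_YEAR = 15_000   # industry average, new code per team member
AVG_MAINTAINED_LOC = 50_000     # code an average developer can maintain
EXPERT_MULTIPLIER = 10          # generous assumption from the text


def developers_to_build(total_loc: int, years: int, multiplier: int = 1) -> int:
    """Head count needed to write `total_loc` lines within `years` years."""
    return math.ceil(total_loc / (AVG_NEW_LOC_PER_YEAR * multiplier * years))


def developers_to_maintain(total_loc: int, multiplier: int = 1) -> int:
    """Head count needed to maintain an existing code base of `total_loc` lines."""
    return math.ceil(total_loc / (AVG_MAINTAINED_LOC * multiplier))


print("build 3M LOC in 2 years, average:", developers_to_build(3_000_000, 2))
print("build 3M LOC in 2 years, experts:",
      developers_to_build(3_000_000, 2, EXPERT_MULTIPLIER))
print("maintain 20M LOC, average:", developers_to_maintain(20_000_000))
print("maintain 20M LOC, experts:",
      developers_to_maintain(20_000_000, EXPERT_MULTIPLIER))
```

Changing the multiplier to the 6x observed in the bugfixing team, or to Tom Love’s 25x, shifts the head counts but not the conclusion: the number of experts needed stays in the dozens.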

Well, again somebody may argue that the solution is in better candidate selection process (e.g. one of Tom Love’s ideas as well).  It is easy to show that this does not work for bigger development teams.  And here is why.

Let’s assume that your selection process is very good and it selects a single expert developer with probability Px. Then, assuming each hire is independent, the probability of hiring a team consisting of n experts is p=Px**n (Px to the power n), which is an exponential function. Here are the probabilities of hiring all-expert teams of different sizes for different values of Px (0.9, 0.8, 0.7, 0.5):

As you can see, even a very good selection process with Px=0.9 gives us less than 50% probability of success while hiring a team of 10 expert developers. In reality I would say most of selection processes have Px around 0.5, so hiring a team of 3 expert developers would be problematic.
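The probabilities discussed above are easy to recompute. The following Python sketch evaluates p=Px**n for the four Px values mentioned in the text; the team sizes and the function names are chosen arbitrarily for illustration, and independence of individual hires is assumed.

```python
def all_expert_probability(px: float, team_size: int) -> float:
    """Probability that every one of `team_size` independent hires is an expert."""
    return px ** team_size


def probability_table(px_values, team_sizes):
    """Map each Px to the list of all-expert probabilities per team size."""
    return {px: [all_expert_probability(px, n) for n in team_sizes]
            for px in px_values}


team_sizes = [1, 2, 3, 5, 7, 10, 20]
table = probability_table([0.9, 0.8, 0.7, 0.5], team_sizes)

print("Px   " + "".join(f"n={n:<6}" for n in team_sizes))
for px, row in table.items():
    print(f"{px:<5}" + "".join(f"{p:<8.3f}" for p in row))
```

With Px=0.9 the probability already drops below 50% at a team size of 7, and with Px=0.5 a three-person all-expert team comes together only one time in eight.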

An interesting consequence of this is the fact that small startup companies have much higher chances to have all-expert development teams. This is the important fact which itself has  interesting consequences we’ll discuss in further posts.

Also, according to the Law of Large Numbers, the more developers we hire, the more their skills will trend toward the industry average. “Nothing personal, just statistics” :)

All the above says to me that it is possible to assemble a substantial team of expert developers for certain projects for a limited time; however it is unrealistic to expect that nowadays the whole IT industry will be able to staff  developer teams with expert developers or that a substantially large development team may be staffed by experts for a long (many years) period of time.

Conclusion: modern demand for software development can’t be fulfilled by software craftsmen alone.

This is not an unusual case. Many industries used craft production initially (e.g. car manufacturing, aircraft manufacturing etc), but at some point they had to employ other organizational forms of production to be able to fulfill growing demand.

We will discuss this in later posts.

Read Full Post »

I received a number of comments on my previous post from this series. They mostly revolved around the definition of a software craftsman. This is an interesting topic which probably deserves a more detailed discussion.

To start with, we have to distinguish between the software craftsman role and the qualities necessary for an individual to fit into the role. The role is a developer producing code in the craft way. Not every developer is capable of doing this successfully. At the same time, not every developer capable of software craftsmanship actually acts as a craftsman; e.g. she/he may work in a company which does not produce software in the craft way. Only when the role and the capability come together do we have a real software craftsman.

What is a craftsman in a broad sense? It is an individual who produces a relatively complex product (clock, knight’s armor, beer, glass vase, shoes etc) on his/her own from start to end, with good quality as far as the end user of the product is concerned.

If we translate the definition to software development, I would say that

A software craftsman is a developer capable of architecting, designing, implementing, testing and deploying to production a substantially large software product of good end-user quality on her/his own (with the possible aid of business analysts).

Probably all would agree that this requires an individual having considerable skills and expertise, i.e. an expert developer in a broad sense. I also think that most would agree that not each and every “rank and file” modern developer has such level of expertise. In fact, my observation is that just a small fraction (maybe not more than 10% or 20%) of overall software development populace can be classified as all-around experts.

The definition above refers to the quality of the product, which in my opinion contributes further to the confusion of opinions around what a software craftsman is. This is because quality of software is a complex multi-dimensional notion and people mean different things when talking about it.

There are two major  aspects of quality of software: external quality which is quality of the software as seen by its end user (end user quality) and internal quality which is quality of the software as seen by IT folks, i.e. developers, testers, operational support engineers etc.

Understanding the difference is important since those two aspects of quality do not necessarily come together. A perfect example can be found on many web sites. Look e.g. at the web page where you are reading this post (the page design is courtesy of WordPress, which I’m using to publish the blog; all credit for the design goes to WordPress). The page looks good, its purpose is clear and it is very easy to use for an end user, right? Now take a look at the page source. What do you see? Rather not-so-clean HTML code with quite a bit of hardcoded CSS styles, JavaScript snippets etc. External quality is good, internal quality is rather not as good.

Software craftsmen invariably strive to build products of good end user quality. Of course, this is not always possible given many external factors like corporate budgets, release schedules etc. But still, they build as good products as possible given the circumstances they operate within.

Internal quality is a different matter. Often, when a software craftsman works alone or with just a few other craftsmen, they may produce code which does not look so good from the standpoint of the quality standards of mass produced software. There may be no comments, few or no unit tests, a lot of non-standard components, no automatic build, no automatic deployment etc. However, if the software is written with the intent of being used by other IT folks (e.g. open source projects), the internal quality of such software when produced by craftsmen is almost invariably good, since in this case the craftsman considers other IT folks to be the end users.

By the way, software which looks good to a developer may have poor quality for other members of the IT industry as well. Surprised? Well, this is what often happens in reality. Even well-written software may be a nightmare to test automatically or to deploy/upgrade. For example, how many times have you seen software which is very easy to roll back, so that its previous version becomes operational in no time if problems are found after a new version is deployed? My experience is that this capability is very rare, although it can make the life of operations folks much easier.

There is much more than that to quality of software and I can spend considerable amount of time on discussing software quality, but let’s return to a definition of software craftsman.

The ability to come up with an idea for an innovative product is not a necessary part of a software craftsman’s skill set. In fact, very few craftsmen in the broad sense, software craftsmen included, have ever come up with innovative product ideas. Their distinction lies in building/manufacturing products rather than inventing them.

Formal training in a particular area (IT in our case) is not a mandatory prerequisite either. True, having good formal training definitely helps (e.g. both Bjarne Stroustrup, creator of C++, and James Gosling, creator of Java, have Ph.D.s in Computer Science). However, I have known several very good developers who were essentially self-taught. Consider also two true IT legends: Dennis Ritchie, creator of C and co-creator of Unix, and Brian Kernighan, co-author of the classic book on C and an early Unix contributor. The former was trained in Physics and Applied Mathematics, the latter in Engineering Physics and Electrical Engineering.

This is true for other occupations as well. Charles Darwin was trained as a priest, Arthur Conan Doyle was a doctor, Charles Chaplin didn’t get any systematic education whatsoever.

Before I finish the post, I’d like to mention an interesting observation I made. I have noticed an interesting feature which I almost invariably see in software craftsmen: they very rarely if ever use debuggers. As soon as I see a developer who successfully and quickly writes code without resorting to a debugger, this almost invariably means that the individual can potentially work as craftsman.

Why is this so? This is because software craftsmen deliberately attempt to write code with very few bugs, and to write the code in such a way that the bugs can be found and fixed very easily without any debugger. They do this because it is much more time efficient to write bugless code than to write code with bugs and then fix them using complex and slow tools such as debuggers.

When I tell a broader audience that it is possible to write code with just very few bugs, they often look at me incredulously and challenge me on whether this is possible at all.

Yes, this is possible and I know this for sure. It is not too simple to learn because it requires making certain practices an automatic part of how a developer does her/his work, but still this is not a “rocket science” and can be achieved within a year or two. I have mentioned certain such practices in one of my previous posts; good code design helps as well.

If you are interested in finding out more about practices that help to write code without bugs, I would highly recommend reading the book by Glenford Myers. It is an old book by IT standards: it was published in 1976 (i.e. it was written before object oriented design, the internet and PCs, and even before some modern software developers were born :) ) and it is sometimes a bit outdated, but the bulk of its recommendations is still very valid and rock solid.

Read Full Post »

Software development has existed for around 70 years. Its purpose is to produce (to manufacture) software, and from this standpoint it is just one of many manufacturing industries. The car industry makes cars, the clothing industry makes clothes, the software industry makes software.

Despite this software development is still positioned apart from manufacturing industries as something very different. Even the terminology is different. Software is not manufactured, it is written or developed, right? Agreed, there are specifics of software development that keep it apart from shoe manufacturing. However there are specifics in shoe manufacturing that keep it apart from computer manufacturing, right? In my opinion, there are many things in common and looking at parallels between software development and manufacturing may bring better understanding where software development is as an industry and where it might be heading.

Let’s for example, look at organizational forms of software development.

There are different organizational forms in manufacturing industries in general. Arguably the oldest one is craft production. That’s what Wikipedia says about it:

Craft production (or One-off Production) is the process of manufacturing by hand with or without the aid of tools. The term Craft production refers to a manufacturing technique applied in the hobbies of Handicraft but was also the common method of manufacture in the pre-industrialized world. For example, the production of pottery uses methods of craft production.

A side effect of the craft manufacturing process is that the final product is unique. While the product may be of extremely high quality, the uniqueness can be detrimental as seen in the case of early automobiles.

Like many other manufacturing industries, software development started as craft production of software code. The code was often “manufactured” by hand with little or no aid of tools (IDEs, debuggers etc).

For example, in the 1980s I worked with a colleague, an outstanding software developer, Viktor Krivcov. He wrote code exclusively in assembler language using just a basic text editor. No IDEs, no debuggers, no builds etc. He developed code with astonishing speed. For example, he developed ASPECT – a quite complex (for that time) interactive multi-user CAD system for simulating electronic circuits – almost single-handedly within just a year. The system was written in assembler language for the BESM-6 mainframe. Interestingly, Viktor steadfastly refused to use any high-level language available at the time on the mainframe (FORTRAN, ALGOL, C – to mention a few) and I suspect he considered that coding in them was not for real software developers :)

Please note that, while talking about craft production, I do not attach any negative connotation to it. In fact, I am a co-signer of the Manifesto for Software Craftsmanship and I share its values.

Craft production of software is still well alive and flourishing. It often produces products of extremely high quality as Wikipedia stated about craft production in general. Just several typical examples: Nginx , Redis and RabbitMQ.

Nginx is a web server which is an approximate functional analogue of the Apache web server, but it performs the work and scales much more efficiently. I did certain scalability tests of both in the past and I was really impressed by Nginx. Nginx is one of the few web servers designed to handle the so-called C10K problem. And guess what: it was developed singlehandedly by Igor Sysoev (probably there are other developers working on it by now).

Redis is an extremely fast NoSQL DB. Its initial version was developed by Salvatore Sanfilippo alone.

RabbitMQ is very fast, scalable and flexible messaging middleware written in Erlang. Its initial version (prototype) was developed by just two developers ( Matthias Radestock and Matthew Sackman) as far as I know.

What are distinctive characteristics of craft production of software?

  1. All developers are highly skilled, often with an education level higher than average (e.g. Viktor Krivcov has a Ph.D., Matthias Radestock has a Ph.D., Matthew Sackman is in the process of getting one). Of course, being highly educated does not necessarily mean having a degree
  2. All developers are highly motivated
  3. Very small teams are used, often consisting of one or a few developers
  4. There is no separation of roles within the team as a rule. Technical architect, quality assurance and often business analyst functions, as well as many other functions, are performed by the developers
  5. The choice of technologies and tools is often (but not always) very special, sometimes outside of the current IT mainstream. Redis and Nginx are written in ANSI C (not even C++!), RabbitMQ is written in Erlang, ASPECT was written in assembler
  6. The teams tend to create and use their own tools instead of existing ones. For example, the Redis team developed their own load test tool for Redis despite the many load testing tools available
  7. Very little hardware is used (and, consequently, less investment is required). Things like certification environments, staging environments, UAT environments are almost invariably absent

#1 and #2 are mandatory prerequisites for successful craft production of software. #3 to #7 are consequences of those prerequisites: if #1 and #2 are present, #3 to #7 come along naturally.

Why, for example, developers employed in craft production of software often use technologies outside of the current IT mainstream? Two reasons:

  • They choose technologies that are the most suitable for the task and not technologies that are the most fashionable or the technologies they know better. RabbitMQ team had chosen Erlang because it produced incredibly fast networking code, Salvatore had chosen ANSI C since he wanted to eliminate all overheads added by more advanced languages like C++, Viktor had chosen assembler since only it allowed him to build the system he envisioned within limited memory and CPU speed of the mainframe designed in 1967.
  • Their advanced skills allow them to be comfortable and productive with technologies that are hard to use for developers with average skills

Craft production of software is very cost efficient, which is why it is often used by small startup companies. There are two main variations of such companies:

  • Startup companies that produce software for the IT industry (like Nginx, Redis, RabbitMQ). They sometimes consist of just the development team and almost nobody else
  • Startup companies that produce software for other industries, e.g. the aviation industry. Such companies almost invariably consist of a subject matter expert/business analyst, who is not a developer but an expert in the target industry, and a development team

Craft production of software has many positive sides. But it has a big problem, too: it does not scale to the whole IT industry.

Only a minority of developers satisfy requirements #1 and #2 above. My observation is that probably not more than 20%, and likely just 10%, of the overall development community fit the software craftsman role.

Certain companies do large-scale software development and naturally need many developers. Unfortunately, it is impossible to staff big development teams with highly skilled and highly motivated developers only. There are just not enough of them on the market. True, certain companies like Apple or Google could probably do this due to their standing within the IT industry. But by and large it is impossible.

Even smaller companies that need just a few software developers often cannot hire developers capable of craft software development.

So if the software development industry wanted to scale, it had to find ways to produce decent software with developers of just average skill and average motivation.

This problem is not unique. At some point almost every industry faced a similar problem. Take car manufacturing, for example. It started with small craft shops. But soon, with growing demand, manufacturers found that there were not enough skilled workers to produce cars on a large scale under a craft production organization.

The answer many manufacturing industries found to this challenge is mass production. We’ll discuss mass production of software in the next post. See also the post about the definition of a software craftsman.

Read Full Post »

Well, I’m about to start development of a social application (Squirrel2 project) and I have to decide on what technology stack I’m going to use. This is a very important decision: making a wrong technology choice can easily result in project failure.

Here is a good study of the current technology stacks of well-known social applications. Looks a bit intimidating, right? :)

Do I really need all this software? I don’t think so.

All new social applications face two major risks at the beginning:

  • Their development may never be completed. Most new social applications (with some exceptions like Google+) are developed either by individuals or by small teams with very modest financing. In both cases resources are very limited and there may simply not be enough time or money to complete development. This is very true of Squirrel2: I have just around 500 hours for the whole development, according to my one year challenge.
  • Even if the initial development is completed, the application may not get enough traction with its potential users. This happens more often than not. Therefore developers should bring a first version to production as soon as possible, to verify its viability before too much money or effort is spent. This, too, implies keeping the initial development to a minimum

So the bottom line is: the initial technology stack should allow launching the first version of the application as soon as possible, with as little investment and development as possible.

Notably, the initial versions of many successful social applications were built using technologies that are better known for rapid development than for scalability or high performance:

  • The initial version of Twitter was built in Ruby on Rails.
  • The initial version of Facebook was built on PHP.

And what about scalability, ability to serve millions of concurrent users? I wish I had that problem :) Those issues will be dealt with when and if it becomes necessary.

The Squirrel2 technology stack must satisfy the following requirements:

  • The main development environment/main technology should be optimized for quick development
  • At the same time it should have a clear transition path to a more scalable solution
  • The technology stack must minimize the amount of new development; the more ready-made solutions I can use instead of doing my own development, the better. I will make an exception for technologies that I’m specifically interested in learning, though
  • Technologies that I already know are preferable whenever feasible; I don’t have much time for learning given that the whole Squirrel2 development budget is 500 hours
  • Naturally, the technology stack should be suitable for development of web applications

I have been thinking of the technology stack for some time. Here is what I decided to use.

Main technology

Since my main competence as a software developer is in Java, I could choose that language as the main technology. However it would be somewhat boring… I want to learn new things, remember?

So after some research I have decided to use Grails.

A primary reason for this is that it is a rapid development technology. Yeah, I know it was inspired by Ruby on Rails, so should I have used RoR?

RoR is a very good technology. The reasons why I decided on Grails and not on RoR are the following:

  • I just happen to like Groovy (the language used by Grails). I think it yields very elegant and easy-to-understand code, and I consider those very important qualities. The absence of those qualities is the reason why I did not select e.g. Scala and Lift, which I evaluated as well; to me Scala code is just somewhat cryptic and difficult to read.
  • Groovy produces quite compact code, about half the length of the corresponding Java code. This is also a very important advantage in my opinion.
  • Groovy++ and Java (the ability to mix Groovy and Java, to be precise) offer a clean and easy gradual path to future performance improvements in case I need them. RoR does not offer a similar path as far as I know.
  • The ability to mix Groovy and Java allows leveraging the huge collection of high-quality open source software that Java has accumulated over time (remember, my goal is to minimize the amount of development).

Grails comes with a lot of plugins that provide functionality I hope to leverage. The Grails community is vibrant. In short, Grails is “what the doctor prescribed” :) Decided: my choice is Grails.

Database

Grails is mostly used with relational databases. I could use e.g. MySQL, which has worked well for me in my previous projects. However, this time I have something else in mind: NoSQL DBs.

I think NoSQL DBs are part of a major paradigm shift happening right now in the IT industry and I want to have hands-on experience with those technologies. Funnily enough, unlike the much-hyped Web 2.0 or SOA, this paradigm shift, which in my opinion is much bigger and more important than those two, is happening almost unnoticed. I will write a post or two on the paradigm shift and NoSQL DBs soon.

There are Grails GORM plugins for several NoSQL databases: MongoDB, Riak, Redis, HBase, Gemfire.

HBase is for huge data grinding facilities; it is too big for my needs. Gemfire is proprietary, so I am left with MongoDB, Redis and Riak, each of which is quite good. I will think a bit more about this and describe my choice later in a separate post.

Presentation layer

Naturally, social applications use web technologies at their presentation layer. So will Squirrel2. I plan to use HTML5 and Rich Internet Application technologies.

In particular, I am considering jQuery, jQuery UI and their plugins. I very much like the architecture of jQuery, and it is a sort of de facto standard in modern web GUI technologies.

I also plan to use the CoffeeScript language for writing JavaScript code. I consider CoffeeScript a very promising technology and want to try it.

That’s all for today. I will describe my choice of tools and supporting technologies in the next post.

Read Full Post »

We have discussed basic rules of writing code in the previous post. What else could a developer do to increase her performance?

She should learn basics of good object-oriented design! My observation is that many developers don’t believe that all those rules of object-oriented design (OOD) do matter. They apparently think that the rules were invented by various pundits to sell their books or teaching services on software development methodology.

In fact, the whole of object-oriented design was invented to improve developers’ performance. I’ll show this in a minute.

Let’s suppose you follow the six rules I described in the previous post. What other factors inhibit software developer performance? Those are:

  1. Complexity of the design which is a part of software complexity
  2. Tight coupling
  3. Code duplication

The more complex the design is, the more details you have to keep in mind, which increases the probability of making mistakes. For example, introducing a bug while using an API that consists of just 3 methods is much less likely than when working with an API consisting of 3000 methods, right?

If a component you work with is tightly coupled with 10 other components, you have to keep in mind the details of all 11 components, right? This again increases the probability of making a mistake.

Code duplication is like slow poison. Initially it is so simple to cut and paste a piece of code. But later you pay for it:

  • You have to remember that the duplicates exist
  • Each time you change one instance, you need to find, change and test all other occurrences of the code fragment
  • The duplicated fragments gradually diverge as time goes by, and at some moment you will find that a change that works correctly within one instance of the once duplicated code breaks another instance

All this multiplies the time that a developer needs to develop and maintain code.
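As a minimal sketch of the simplest remedy, here is a rule kept in a single method instead of being pasted into every caller. The class, the method names and the 30-character limit are hypothetical and chosen only for illustration.

```java
/** Hypothetical example: one validation rule shared by two operations. */
class DuplicationExample {

    /** The only copy of the rule; a change here reaches every caller at once. */
    static boolean isValidUserName(String name) {
        if (name == null) {
            return false;
        }
        String trimmed = name.trim();
        return !trimmed.isEmpty() && trimmed.length() <= 30;
    }

    /** First caller: would otherwise carry its own pasted copy of the check. */
    static String register(String name) {
        if (!isValidUserName(name)) {
            return "rejected";
        }
        return "registered " + name;
    }

    /** Second caller: reuses the same rule, so the two can never diverge. */
    static String rename(String oldName, String newName) {
        if (!isValidUserName(newName)) {
            return "rejected";
        }
        return "renamed " + oldName + " to " + newName;
    }

    public static void main(String[] args) {
        System.out.println(register("alice"));
        System.out.println(rename("alice", "   "));
    }
}
```

If the rule changes later (say, the length limit), there is exactly one place to edit and one place to test.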

Object-oriented design is a way to address tight coupling and code duplication:

  • Inheritance allows reusing code without duplicating it
  • Polymorphism allows modifying reused code in a clean way
  • Encapsulation of data reduces coupling
  • Good object-oriented design reduces complexity and makes the code easier to work with

To understand how good object-oriented design helps to improve a developer’s performance, let’s take a look at, for example, the set of OOD principles known as SOLID:

  • Single responsibility principle (SRP), the notion that an object should have only a single responsibility.
  • Open/closed principle (OCP), the notion that software entities should be open for extension, but closed for modification
  • Liskov substitution principle (LSP), the notion that objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program
  • Interface segregation principle (ISP), the notion that many client specific interfaces are better than one general purpose interface
  • Dependency inversion principle (DIP), the notion that one should depend upon abstractions; do not depend upon concretions

SRP suggests that an object must do just one single thing. It is simpler to use an object that represents just Car and nothing else than an object that represents Car and Coffee Grinder simultaneously, right?

OCP suggests that if you want to modify the behavior of a class, instead of modifying the class itself you had better create a subclass that implements the desired modified behavior. The reason for this is simple. If the class already exists, there may be parts of the code that expect it to behave in a particular way. If you modify the behavior in place, you have to find all such places, understand how they will be affected, modify them properly, test the modifications… Creating the subclass and using it where needed is much simpler and requires much less work.
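The subclassing approach described above can be sketched roughly as follows. The price calculator classes are hypothetical; they only illustrate adding a behavior without touching a class that other code already depends on.

```java
/** Hypothetical example: new behavior added by extension, not by modification. */
class OcpExample {

    /** Existing class; callers already rely on it, so it stays unchanged. */
    static class PriceCalculator {
        long priceInCents(long baseInCents) {
            return baseInCents;
        }
    }

    /** The modified behavior lives in a subclass. */
    static class DiscountedPriceCalculator extends PriceCalculator {
        private final int discountPercent;

        DiscountedPriceCalculator(int discountPercent) {
            this.discountPercent = discountPercent;
        }

        @Override
        long priceInCents(long baseInCents) {
            long fullPrice = super.priceInCents(baseInCents);
            return fullPrice * (100 - discountPercent) / 100;
        }
    }

    /** Existing caller; it works with either calculator without being edited. */
    static long total(PriceCalculator calculator, long[] itemsInCents) {
        long sum = 0;
        for (long item : itemsInCents) {
            sum += calculator.priceInCents(item);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] items = {1000, 500};
        System.out.println(total(new PriceCalculator(), items));
        System.out.println(total(new DiscountedPriceCalculator(10), items));
    }
}
```

Code that still wants the old behavior keeps passing a plain `PriceCalculator`; only the places that need the discount switch to the subclass.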

LSP is just a way to verify whether a particular inheritance relationship is correct. You would not make the Car class a subclass of the Road class, right? You feel that something is wrong with such inheritance, and LSP explains what is wrong (a Car can’t be used in cases where we need a Road, e.g. one can’t lay tarmac on a Car, right?). Wrong inheritance makes code much more difficult to understand and to use.

ISP just says that several smaller, client-specific interfaces are easier to work with than one big general-purpose one.
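Interface segregation as defined in the SOLID list above (many client-specific interfaces rather than one general-purpose interface) might look like this in a small sketch. The printer and scanner types are hypothetical.

```java
/** Hypothetical example: two narrow interfaces instead of one wide "device" interface. */
class IspExample {

    interface Printer {
        String print(String document);
    }

    interface PageScanner {
        String scan();
    }

    /** A simple device implements only the capability it really has. */
    static class BasicPrinter implements Printer {
        public String print(String document) {
            return "printed: " + document;
        }
    }

    /** A richer device combines both narrow interfaces. */
    static class MultiFunctionDevice implements Printer, PageScanner {
        public String print(String document) {
            return "printed: " + document;
        }

        public String scan() {
            return "scanned page";
        }
    }

    /** This client needs printing only, so it sees nothing about scanning. */
    static String printReport(Printer printer) {
        return printer.print("report");
    }

    public static void main(String[] args) {
        System.out.println(printReport(new BasicPrinter()));
        System.out.println(printReport(new MultiFunctionDevice()));
    }
}
```

A developer reading `printReport` has one method to keep in mind, not the whole surface of every device.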

DIP helps to improve developer’s performance in two ways. First of all it reduces coupling, as explained here. Second, the code designed with dependency inversion in mind is much easier to unit test, which is a huge advantage.
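To illustrate the unit testing point, here is a minimal sketch in which a service depends on an abstraction and receives it through its constructor. All names (`MessageSender`, `WelcomeService`, `RecordingSender`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: high-level code depends on an interface, not on a concrete sender. */
class DipExample {

    /** The abstraction both sides depend on. */
    interface MessageSender {
        void send(String recipient, String text);
    }

    /** High-level logic; it knows nothing about email, SMS or any other transport. */
    static class WelcomeService {
        private final MessageSender sender;

        WelcomeService(MessageSender sender) {
            this.sender = sender;
        }

        void welcome(String user) {
            sender.send(user, "Welcome, " + user + "!");
        }
    }

    /** Test double: records messages in memory instead of sending them anywhere. */
    static class RecordingSender implements MessageSender {
        final List<String> sent = new ArrayList<>();

        public void send(String recipient, String text) {
            sent.add(recipient + ": " + text);
        }
    }

    public static void main(String[] args) {
        RecordingSender fake = new RecordingSender();
        new WelcomeService(fake).welcome("alice");
        System.out.println(fake.sent);
    }
}
```

Because `WelcomeService` never creates its own sender, a test can hand it the in-memory `RecordingSender` and check what was sent without any network or mail server.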

Overall, well designed code is much easier to develop and maintain.

Read Full Post »

This post is about writing code in the most efficient way. It should not be confused with writing efficient code. The former is about the efficiency of the developer, the latter is about the efficiency of code execution.

Strangely, few people ever talk about writing code in an efficient way. For example, various agile practices are concerned with how to organize a development team. Various books and tutorials on object-oriented design talk about how to design the code. But few, if any, talk about how to make the basic daily work of a software developer more efficient.

I graduated from university in 1980. In those times most computers were mainframes, and terminal access to them was not yet common. To run a program, a developer had to punch her code onto punch cards and submit the deck to the mainframe crew to run. The next day she got a printout of the run, which might have been successful, but most likely contained some sort of compilation or runtime error. The developer looked at the error message, tried to figure out what went wrong, changed the code by replacing some punch cards with new ones and repeated the exercise.

As you see, it was an expensive and slow process. And most of the time I was not writing code; the great majority of my time was spent on finding bugs, fixing them and re-testing to make sure that the fix worked (although times have changed, the same is true for many developers even now).

All this made me think that if I could just write code without bugs, or with just a few bugs, in the first place, this could greatly increase my performance as a software developer. I started thinking about how to achieve this, and the result was the six basic rules of code writing below.

Arguably, reading the excellent book “Software Reliability: Principles and Practices” by Glenford J. Myers was a major turning point in my thinking. One of the main ideas I got from the book was the following:

Most bugs happen just because developers who write, extend, modify or try to use a code simply do not understand what the code is doing or how it is doing this.

Here is what Mr Myers wrote (p.152 of 1976 edition of the book):

…(T)he programmer’s goal must be writing source code for the primary audience of people instead of machines. Doing this requires focusing attention on the clarity, simplicity, and understandability of the code at the sacrifice of less important criteria such as brevity (number of key-strokes to initially write or type the program) and machine efficiency.

This means that if we could just write code that is very easy to understand and leaves no room for misinterpretation by humans (not computers!), then the probability of introducing bugs would be greatly reduced. One piece of advice Myers gave was to format the code well, which was in line with the then-popular ideas of structured programming. This did make sense to me: proper indentation gives a clear view of the code structure.

This makes the first rule: format your code properly and consistently.

Another idea came from some article, I believe (I don’t remember which one, unfortunately). The article said that the human brain has a limited “scope” that can be kept in focus at any moment. With regard to software code, the size of that scope is just 10 to 15 lines of code. When a developer analyses a bigger chunk of code, she “shifts” her mental scope up and down the code. The code that is outside of the scope at the moment is not remembered in every detail; just a sort of general summary exists in the brain.

This suggested a very simple but powerful second rule: do not write methods/procedures/functions that are longer than 15-20 lines of code. A method of such size fits into the mental scope entirely, so no “scope shifting” is needed.

Coincidentally, the rule addresses certain other bad practices, too. Take, for example, cyclomatic complexity. Try to write code with high cyclomatic complexity within 15 lines :) It is possible, but it requires a special effort, without which the code will have low complexity almost “automatically”.

Why does cyclomatic complexity matter in our case? There is research (e.g. this one) showing that code with higher cyclomatic complexity has a higher probability of having bugs; e.g. for Java code with complexity higher than 74 this probability is 0.98, i.e. the code is almost guaranteed to be buggy.

The third rule is a sort of generalization of the second one: do not write classes/source files that have more than 15-20 methods, which means they should not be longer than 200-400 lines. The logic behind this rule is simple: the list of all methods of the class should fit into the mental scope as well; in that case it is easy to understand what functionality the class provides and to understand its design.

The fourth rule is derived from the Flesch–Kincaid readability test: do not write expressions that contain more than 2 or 3 operators. The logic here is simple: if you look at the Flesch Reading Ease Score (FRES) formula, you will notice that the more words there are in a phrase and the more syllables there are in a word, the more difficult the text is to understand. Variables and constants in expressions are like words in a phrase or like syllables in a word; having no more than 2 or 3 operators limits the number of participating variables and constants to just 3 or 4, which makes our expression a short “phrase”.
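For reference, the FRES formula is 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), where a higher score means easier text. A small sketch of it follows; the class and method names are hypothetical, and the caller is assumed to supply the counts.

```java
/** Hypothetical helper that evaluates the Flesch Reading Ease Score from raw counts. */
class ReadabilityExample {

    /** Longer sentences and longer words both lower the score. */
    static double fleschReadingEase(int words, int sentences, int syllables) {
        double wordsPerSentence = (double) words / sentences;
        double syllablesPerWord = (double) syllables / words;
        double sentencePenalty = 1.015 * wordsPerSentence;
        double syllablePenalty = 84.6 * syllablesPerWord;
        return 206.835 - sentencePenalty - syllablePenalty;
    }

    public static void main(String[] args) {
        // Same number of words; the second text has longer sentences and longer words.
        System.out.println(fleschReadingEase(100, 10, 130));
        System.out.println(fleschReadingEase(100, 4, 170));
    }
}
```

The second call yields the lower score, which is the effect the fourth rule borrows: fewer “words” per “phrase” are easier to read.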

What should you do if you really need a much more complex expression? Make it a method and invoke the method in the place where you need the expression. One may argue that the method invocation still has to list all those variables. True, but:

  • Invocation of the method that encapsulates the expression is still simpler than the expression itself, since it contains just a list of variables and does not have all the nested parentheses, operators etc
  • Having the expression as a separate method gives us a chance to write a good unit test for the expression which is often difficult when the expression is “hidden” inside a bigger method which contains other things besides the expression
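A minimal sketch of this extraction, using the familiar leap year rule as the complex expression. The class and method names are hypothetical. Written inline, the condition would be `year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)`, which has well over three operators.

```java
/** Hypothetical example: a complex condition moved into its own small method. */
class ExpressionExample {

    /** Each line stays within 2 or 3 operators and reads like a short phrase. */
    static boolean isLeapYear(int year) {
        boolean divisibleBy4 = year % 4 == 0;
        boolean centuryYear = year % 100 == 0;
        boolean divisibleBy400 = year % 400 == 0;
        return divisibleBy4 && (!centuryYear || divisibleBy400);
    }

    /** The call site now contains a name instead of nested parentheses. */
    static int daysInFebruary(int year) {
        return isLeapYear(year) ? 29 : 28;
    }

    public static void main(String[] args) {
        System.out.println(daysInFebruary(1900));
        System.out.println(daysInFebruary(2000));
    }
}
```

The extracted method can be unit tested on its own with a handful of years, independently of whatever larger method uses it.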

The fifth rule is: give meaningful self-explanatory names to classes, functions, methods, variables etc, but don’t make those names longer than 20-30 symbols. I probably don’t need to explain why the names should be meaningful and self-explanatory. But why should they be short? The reason is the same FRES formula: overly long names are difficult to read. Also, if you happen to use a language that does not require declaring variables explicitly, there is an increased risk of a typo, which normally results in a difficult-to-find bug.

But what if 20 or 30 symbols are not enough to explain the purpose of the variable or the class? This leads us to the sixth rule: do not economize on comments; add them each time they can help to make your code more readable. Yeah, I know, developers don’t like writing comments. Well, do they like brushing their teeth? Not everybody likes it, but (almost) all do it. We just should do this. Period. You bet you will appreciate the comments yourself when you return in a few years to the code you have written. Your colleagues will appreciate them, too, believe me.

So here are my code writing rules again:

  1. Format your code properly and consistently.
  2. Do not ever write methods/procedures/functions that are longer than 15-20 lines of code.
  3. Do not write classes/source files that have more than 15-20 methods which means they should not be longer than 200-400 lines.
  4. Do not write expressions that contain more than 2 or 3 operators. If you really need a much more complex expression, make it a method and invoke the method in the place where you need the expression
  5. Give meaningful self-explanatory names to classes, functions, methods, variables etc, but don’t make those names longer than 20-30 symbols.
  6. Do not economize on comments; add them each time they can help to make your code more readable.

This won’t make you an expert developer overnight. There is much more to it than just those 6 rules. However, if you apply the rules consistently, you’ll soon see your productivity greatly increased.

The next post will be about good object-oriented design and why it matters for a high-performing developer.

Read Full Post »
