java – How far should I follow the conventions, where can I apply specific self-styled patterns?

Question:

How far should I follow the conventions in Java code, that is, how far is convention a rule?

Can I develop and apply my own styles of coding patterns in my code, knowing I'm going to use that pattern throughout?

Answer:

Strictly speaking, the only real rules are those imposed by the compiler; everything else is convention.

In practice, two conventions are in force: the Eclipse community tends to follow a different formatting standard from the folks at Oracle (who inherited Sun's style). Most groups follow either the Eclipse or the Oracle format, while some smaller groups mix the two or adopt conventions that diverge from both in places. In most matters, such as whether to place spaces around operators, casts, methods, parentheses, generics declarations, array indexes, etc., the style tends to be uniform. The biggest divergences concern the placement of { , indentation, and switch .

Getters and Setters

Many frameworks deduce the properties of your objects by looking at their getter and setter methods. If you use a different convention here, those frameworks won't find your properties or handle them correctly, so in this case you are effectively forced to follow the convention.

Getters must have a name starting with get followed by a capital letter, have a return type other than void, and take no parameters. If the return type is boolean (the primitive type), the prefix can be either get or is . Some frameworks (but not all) also allow the is prefix when the return type is Boolean (the wrapper class).

Setters must have a name starting with set followed by a capital letter, have exactly one parameter, and have a void return. Some (but not all) frameworks allow the return to be of the type of the class that declares the method. Also, in most cases, the type of the setter parameter should be the same as the return of the getter.
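To make this concrete, here is a minimal sketch using the JDK's own java.beans.Introspector , which is one of the tools that discovers properties this way. The Person class and its members are hypothetical, made up for this illustration only; other frameworks may apply slightly different rules.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PropertyDiscovery {
    // Hypothetical bean, used only for this illustration.
    public static class Person {
        private String name;
        private boolean active;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public boolean isActive() { return active; }
        public void setActive(boolean active) { this.active = active; }

        // These two ignore the convention, so "age" is not seen as a property.
        public int age() { return age; }
        public void changeAge(int age) { this.age = age; }
    }

    // Returns the sorted names of the properties found by JavaBeans introspection.
    static List<String> propertyNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        try {
            // Stopping at Object keeps the inherited "class" property out of the result.
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                names.add(pd.getName());
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Person.class));
    }
}
```

Only name and active are reported; age is invisible to the introspector even though the class clearly stores it.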

Identifiers

For everything else, strictly speaking, you are free to name things however you like, as long as the compiler accepts it. However, the conventions are still worth following. One of the reasons is syntax coloring, as in the code below:

class Classe1 { // The word Classe1 will appear bluish.
}

class classe2 { // The word classe2 will appear black.
}

int teste; // The word teste will appear black.
int TESTE; // The word TESTE will appear black.
int Teste; // The word Teste will appear bluish.

Notice the color difference. The idea is that in the case above, class names are bluish while the other identifiers are black. However, if you don't follow language conventions, StackOverflow will color the words wrong. It turns out that this is not a problem unique to StackOverflow and many other programs will have the same problem.

The naming convention for identifiers is as follows:

  • Names of classes, interfaces, enums and constructors must have the first letter of each word capitalized, with the other letters lowercase. Words are not separated. The use of numbers is allowed. For example: StringBuilder , JPanel , Consumer , NullPointerException , LayoutManager2 .

  • Names of methods, local variables, instance variables, static variables other than constants, method parameters and lambda parameters must start with a lowercase letter, with the first letter of each subsequent word capitalized and all other letters lowercase. For example, toString , getClass , isVisible , setEnabled , clone , element , start , getX1 , temp , x , y , f1 , etc.

  • Package and module names (Java 9+) must follow the reverse domain pattern and consist entirely of lowercase letters, with no word separation. Numbers are allowed. For example, java.lang.annotation , javax.swing , javax.validation.constraintvalidation , org.json , org.apache.commons.lang3 , org.springframework.web.servlet , java.base , android.service.quicksettings . Names starting with java , javax and javafx are, or should be, reserved for the JDK, and all others should ideally follow the reverse domain pattern (although this is not always the case in practice, as with android). After the part of the package name that matches the domain, packages are divided into subpackages according to how the project is organized, not according to the individual words in its names. That is, you should use com.example.meuprojeto.bancodedados instead of com.example.meu.projeto.banco.de.dados .

  • Constant names (that is, immutable objects declared with static and final ) and enum elements must be all uppercase with words separated by _ . Numbers are allowed. For example, SOUTH_EAST , TOP , TIMED_WAITING , DECIMAL_FLOAT .

  • Generic type parameters are denoted by a single capital letter, such as Map<K, V> or List<E> .
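As a rough illustration of the list above, the sketch below puts each kind of identifier in one place. All the names ( OrderQueue , MAX_SIZE and so on) are hypothetical and exist only to show the casing; a package declaration such as com.example.meuprojeto would normally come first and is left out to keep the snippet standalone.

```java
import java.util.ArrayList;
import java.util.List;

// Class name: every word capitalized. E: single-letter generic type parameter.
class OrderQueue<E> {
    // Constant (static final): all uppercase, words separated by _.
    static final int MAX_SIZE = 2;

    // The enum type is named like a class; its elements are named like constants.
    enum State { EMPTY, PARTIALLY_FULL, FULL }

    // Instance variable: starts lowercase, later words capitalized.
    private final List<E> pendingItems = new ArrayList<>();

    // Method and parameter names follow the same rule as variables.
    boolean addItem(E newItem) {
        if (pendingItems.size() >= MAX_SIZE) {
            return false;
        }
        return pendingItems.add(newItem);
    }

    State getState() {
        if (pendingItems.isEmpty()) {
            return State.EMPTY;
        }
        return pendingItems.size() < MAX_SIZE ? State.PARTIALLY_FULL : State.FULL;
    }

    public static void main(String[] args) {
        OrderQueue<String> queue = new OrderQueue<>();
        queue.addItem("first");
        System.out.println(queue.getState());
    }
}
```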

There are also some frameworks that, as with getters and setters, require that the method names have a certain structure. This was actually quite common before annotations were introduced in Java 5, where tools like JUnit 3 required test methods to have the name prefixed with test and EJB 2 also had several naming rules. With the advent of annotations, these method name restrictions (considered annoying) have been progressively removed (as in JUnit 4 and EJB 3), but every now and then there is one or another framework that still does some sort of imposition.
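The difference between the two approaches can be sketched with plain reflection. This is not how JUnit itself is implemented; the @Check annotation and the CalculatorChecks class are hypothetical stand-ins, used only to contrast discovery by name prefix with discovery by annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

class TestDiscovery {
    // Hypothetical marker annotation, standing in for one a real framework would provide.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Check {}

    // Hypothetical class mixing both styles.
    static class CalculatorChecks {
        public void testAddition() {}      // found by name prefix only
        @Check public void subtracts() {}  // found by annotation only
        public void helper() {}            // found by neither
    }

    // Old style: the method name itself tells the tool what to run.
    static Set<String> byPrefix(Class<?> type) {
        Set<String> found = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().startsWith("test")) {
                found.add(m.getName());
            }
        }
        return found;
    }

    // Newer style: the name is free, the annotation carries the meaning.
    static Set<String> byAnnotation(Class<?> type) {
        Set<String> found = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Check.class)) {
                found.add(m.getName());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(byPrefix(CalculatorChecks.class));
        System.out.println(byAnnotation(CalculatorChecks.class));
    }
}
```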

General spacing

As for spacing, the issue is much smaller, but readability can still be affected, so this issue is also important.

There are basic rules such as:

  • Never put spaces before a comma or semi-colon, but always put them after.

  • Always place spaces around binary operators, but not around unary operators.

  • Do not put a space immediately after ( , { or [ .

  • Never place spaces inside (possibly generic) types, except after the comma that is inside a generic parameter list. That is, List<Map<String, Thread>> and double[] are ok, while List <Map< String , Thread> > or double [] are not.

  • Put a space right after a cast. That is, int x = (int) y is ok, while int x = (int)y is not.

  • Never put spaces immediately before line breaks (they are invisible and useless and only serve to create version conflicts in tools like Git and SVN).

  • Plus a bunch of other little details.
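The rules above can be seen together in a small sketch like the following. The class and method names are hypothetical; only the spacing matters here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SpacingExample {
    // Inside the generic type, the only space is the one after the comma.
    static Map<String, Integer> lengthsOf(List<String> words) {
        Map<String, Integer> lengths = new HashMap<>();
        // Space after each ';' but never before it; no space around the unary ++.
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            lengths.put(word, word.length()); // space after ',' but not before
        }
        return lengths;
    }

    // No space between the element type and [].
    static int truncatedAverage(double[] values) {
        double sum = 0;
        for (double value : values) {
            sum = sum + value; // spaces around the binary operators = and +
        }
        return (int) (sum / values.length); // space right after the cast
    }
}
```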

The location of the {

There is a disagreement as to where to place the { at the beginning of the class, interface, method, if , else , while , for , do...while , synchronized , try , catch or finally . This divergence has existed since the time when C was starting. There are essentially two styles in use:

  • Put the { at the end of the line of the block it is opening – This is the style that Sun adopted and that Oracle followed. This style was created by Brian Kernighan and Dennis Ritchie, creators of C, and was also adopted by Bjarne Stroustrup who created C++ and by Linus Torvalds who created the Linux kernel. Example:

     if (x) {
         // blablabla
     }

  • Put the { alone on its own line. This style was started by Eric Allman, who wrote many BSD Unix utilities in C, and was strongly influenced by the Pascal style of the time, which uses the begin and end keywords to delimit blocks, with begin usually placed alone on its own line. In Java, this is the standard adopted by the Eclipse community. Example:

     if (x)
     {
         // blablabla
     }

There are also other ways to decide where the { goes, and some variants for certain special cases. This divergence has already caused lengthy debates and flamewars on mailing lists and internet forums (and, of course, within StackExchange too). I personally follow the style adopted by Oracle, with one small exception: in method and constructor declarations, when the parameter list is long and ends up split across several lines, I put the { on its own line so that it stands out instead of just hanging at the end of the last parameter line.

Maximum width of a line

This point can also be controversial. Most conventions dictate that 80 or 79 columns is the limit.

However, this limit comes from the terminals, consoles and printers of the 1980s and earlier, which were limited to 80 columns on screen or paper. Today's screens go far beyond that.

Also, Java is a very verbose programming language, so it is very easy and common to go past the eightieth column. Enforcing the 80-column limit can leave statements and expressions split over so many lines (even more so with several indentation levels) that the code becomes significantly harder to read and understand.

Therefore, I consider a limit of 120 to 160 columns to be ideal. I don't give a specific number, as I think it depends a lot on personal preferences and particularities of each project and any number I give would be just my personal opinion.

Indent size

This one is the biggest of all disagreements and the biggest cause of flamewars and fights about styles on the internet.

There are two issues involved here. The first is whether to indent with tabs or with spaces. The second is, if spaces are chosen, how many of them to use.

First, whatever indentation criteria you choose, you must be consistent. Indenting some lines sometimes with tabs and sometimes with spaces is the worst of all worlds. Even worse when the same line mixes tabs and spaces in the indentation. If using spaces, always use the same amount of spaces to represent an indentation, otherwise it will look horrible and inconsistent.
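A consistency check of this kind is easy to automate. The sketch below is a simplified assumption of how such a check could work, not the implementation of any real tool: it only looks at leading whitespace and does not verify that the number of spaces per level is constant.

```java
class IndentationCheck {
    enum Kind { NONE, SPACES, TABS, MIXED }

    // Classifies the leading whitespace of a single line.
    static Kind classify(String line) {
        boolean sawSpace = false;
        boolean sawTab = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == ' ') {
                sawSpace = true;
            } else if (c == '\t') {
                sawTab = true;
            } else {
                break; // first non-indentation character
            }
        }
        if (sawSpace && sawTab) {
            return Kind.MIXED;
        }
        if (sawSpace) {
            return Kind.SPACES;
        }
        return sawTab ? Kind.TABS : Kind.NONE;
    }

    // Consistent means: no line mixes tabs and spaces, and all indented lines use the same kind.
    static boolean isConsistent(String source) {
        Kind seen = Kind.NONE;
        for (String line : source.split("\\R")) {
            Kind kind = classify(line);
            if (kind == Kind.MIXED) {
                return false;
            }
            if (kind == Kind.NONE) {
                continue;
            }
            if (seen == Kind.NONE) {
                seen = kind;
            } else if (seen != kind) {
                return false;
            }
        }
        return true;
    }
}
```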

Communities in C, C++ and other languages have a myriad of different camps, each with its own niche and its disputes with the others. In Java, there are essentially only two camps: indent with 4 spaces or indent with 1 tab. Sun originally recommended either way, and the Eclipse community embraced the second. Later (around the time of Java 5, I think), Sun changed its mind, standardized on the first form for itself, and started recommending only that one (4 spaces).

In my personal opinion, indentation with spaces is better because:

  • In theory, tab-indented code should work with whatever tab size the reader chooses, so the exact size would be up to each reader. In practice, however, only the exact size used by the original coder works, and if two or more developers have changed different parts of the same code using different tab sizes, it will look wrong no matter which tab size is used.

  • When using spaces, the code I've written will be seen by my neighbor exactly as I do. Likewise, the code my neighbor wrote will be seen by me exactly as he sees it.

  • Having to keep setting the tab size in each editor for every different code I find out there sucks.

  • Many programs, including email clients and web browsers, do not make it easy to set the tab size (most assume a tab is 8 spaces, some assume 4). These programs also have no way of guessing which tab size would be most suitable in each situation.

  • Paying attention to tab width is something the user who is just occasionally browsing a page or reading emails on mailing lists shouldn't have to worry about.

  • Respecting the maximum line width is much more difficult when using tabs instead of spaces, as my tab size may differ from my neighbor's tab size.

  • If you get an error message saying that something in column 33 of row 82 is wrong, and that row is indented with tabs, figuring out what exactly column 33 in that row is can be a little difficult.
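The last point can be illustrated with a small calculation. This is a minimal sketch assuming the usual tab-stop behavior (a tab advances to the next multiple of the tab size); the class and method names are hypothetical.

```java
class TabColumns {
    // Returns the 1-based visual column of the character at charIndex,
    // assuming tab stops every tabSize columns.
    static int visualColumn(String line, int charIndex, int tabSize) {
        int column = 0; // 0-based while scanning
        for (int i = 0; i < charIndex; i++) {
            if (line.charAt(i) == '\t') {
                column += tabSize - (column % tabSize); // jump to the next tab stop
            } else {
                column++;
            }
        }
        return column + 1;
    }

    public static void main(String[] args) {
        String line = "\t\tint x;";
        // The same character ('i', at index 2) lands on a different column per tab size.
        for (int tabSize : new int[] {2, 4, 8}) {
            System.out.println(tabSize + " -> " + visualColumn(line, 2, tabSize));
        }
    }
}
```

With two leading tabs, the i of int sits at column 5, 9 or 17 depending on whether the tab size is 2, 4 or 8, so a column number in an error message is ambiguous unless you know which tab size the tool assumed.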

In practice, many communities across various programming languages are very slowly abandoning tabs and adopting space-only indentation. Sun itself ended up doing so for the reasons described above. Python 3 indentation conventions also discourage tabs and accept them only to remain consistent with code that is already indented with tabs.

This migration from tabs to spaces is very slow (taking decades) because many people refuse to give up tabs and simply hate indenting with spaces, and because a lot of software still uses the tab as its default form of indentation. This is the hottest debate and point of disagreement in the whole matter of code-writing conventions.

The cases of the switch

This one is also a point of disagreement, although a much smaller one than the previous three. In practice there are two conventions competing with each other. They are:

  1. Put the case and default labels at the same indentation level as the switch :

     switch (x) {
     case 1:
         // Blablabla
     case 2:
         // Blablabla
     default:
         // Blablabla
     }

  2. Put the case and default labels one indentation level deeper than the switch :

     switch (x) {
         case 1:
             // Blablabla
         case 2:
             // Blablabla
         default:
             // Blablabla
     }

The } before the else , catch and finally

This one is a silly detail, but there are three different styles:

  1. if (x)
     {
         // blabla
     }
     else
     {
         // blabla
     }

     try
     {
         // blabla
     }
     catch (AlgumaException x)
     {
         // blabla
     }
     finally
     {
         // blabla
     }

  2. if (x) {
         // blabla
     } else {
         // blabla
     }

     try {
         // blabla
     } catch (AlgumaException x) {
         // blabla
     } finally {
         // blabla
     }

  3. if (x) {
         // blabla
     }
     else {
         // blabla
     }

     try {
         // blabla
     }
     catch (AlgumaException x) {
         // blabla
     }
     finally {
         // blabla
     }

People who put the { on its own line almost always adopt the first style.

Those who put the { on the same line as the declaration of the block being opened tend to adopt the second style, but in some cases may prefer the third.

Array-type variables

There are two equally valid ways in Java to declare an array:

  1. public static void main(String[] args)
    
  2. public static void main(String args[])
    

In general, the first form is considered superior because it follows the pattern [variable type + variable name] that applies to every other way of declaring variables in the language. The second form is much less readable and exists only because it was inherited from C and C++. In it, you first declare part of the variable's type, then the variable name, and then the rest of the type, so the information about the variable's type ends up needlessly scattered across two places.
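The scattering becomes a real trap when several variables are declared in one statement. The following sketch (with hypothetical names) shows that the brackets attached to the name apply only to that one variable:

```java
class ArrayDeclarations {
    static String describe() {
        int[] a, b;  // type-first form: both a and b are int[]
        int c[], d;  // C-style form: c is int[], but d is a plain int

        a = new int[2];
        b = new int[3];
        c = new int[4];
        d = 5;       // assigning an array to d would not compile

        return a.length + " " + b.length + " " + c.length + " " + d;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "2 3 4 5"
    }
}
```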

Alignment of parameters on multiple lines

This one is somewhat controversial and concerns the placement of parameters of methods and constructors, when these are very numerous. Therefore, consider the following cases:

  1. public String metodo(
            int x,
            int y,
            int z);
    
  2. public String metodo(int x,
                         int y,
                         int z);
    

Both forms are found out there, but I personally am only in favor of the first for the following reasons:

  • The first form keeps the same indentation pattern as the rest of the code, and no line ends up indented by a number of spaces that isn't a multiple of the indentation size.

  • The second form is fragile: if you decide to change the method name, the return type, or any of the modifiers static , public , protected , private , strictfp , abstract , final , default or native , you'll have to take care not to mess up the alignment of the parameters.

  • If the second form is indented with tabs, the result will be a disaster. Often it will be necessary to mix spaces and tabs in the indentation, since the indentation of the parameter lines may not be a multiple of the tab size. Furthermore, only one specific tab size will produce the proper alignment, and the idea that any tab size would do goes out the window.

In the first form above, the parameters are normally indented two levels relative to the line with the method name. The reason is to put them at an indentation level different from both the method body and the declaration itself. For example:

public String meuMetodo(
        int parametro1,    // Two indentation levels beyond the header.
        int parametro2)
{
    return "abc";          // One indentation level beyond the header.
}

Braces after if , else , while and for

The use of curly braces ( {} ) after if , else , while or for is optional in Java if the body is a single statement (a characteristic inherited from C and C++). This is because the body of these constructs is defined as either a single statement or a set of statements delimited by braces.

Note the following two ways:

  1. if (x) {
        fazerAlgumaCoisa();
    } else {
        fazerOutraCoisa();
    }
    
    for (int i = 0; i < 10; i++) {
        System.out.println(i);
    }
    
  2. if (x)
        fazerAlgumaCoisa();
    else
        fazerOutraCoisa();
    
    for (int i = 0; i < 10; i++)
        System.out.println(i);
    

The two forms are equivalent, and some people like the second form. I am strongly opposed to the second form because it is very prone to accidental oversights:

if (x)
    fazerAlgumaCoisa();
    fazerOutraCoisa();

estamosForaDoIf();

Note that in this case the indentation is misleading and makes it look like the call to fazerOutraCoisa(); is inside the if , when in fact it is outside. This often leads programmers to fool themselves and write buggy code, which could be prevented by always using braces in if , else , for and while blocks.

Another disastrous case:

if (x)
    //estamosDentroDoIf();

estamosForaDoIf();

In the case above, commenting out the line inside the if made the next line, which used to be outside the if , sneak inside it!

Another case:

if (x)
    if (y)
        System.out.println("x and y are true.");
else
    System.out.println("x is false.");

Note that the else appears to belong to the outer if , but it actually belongs to the inner if , so the code won't do what the programmer thinks it does.
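The effect can be checked with a runnable sketch. The method names are hypothetical; the first method keeps the misleading layout, and the second uses braces to express what the programmer meant.

```java
class DanglingElse {
    // Misleading layout: the else binds to the nearest if, which is "if (y)".
    static String withoutBraces(boolean x, boolean y) {
        String result = "nothing printed";
        if (x)
            if (y)
                result = "x and y are true";
        else
            result = "x is false";
        return result;
    }

    // With braces, the else belongs to "if (x)", as the indentation suggested.
    static String withBraces(boolean x, boolean y) {
        String result = "nothing printed";
        if (x) {
            if (y) {
                result = "x and y are true";
            }
        } else {
            result = "x is false";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(true, false)); // prints "x is false" although x is true
        System.out.println(withBraces(true, false));    // prints "nothing printed"
    }
}
```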

There is just one case where I think it is acceptable to omit the braces of an if (but that's my personal opinion): when the if has no else and fits on a single line:

if (x) fazerAlgumaCoisa();
estamosForaDoIf();

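The try , catch , finally , switch and synchronized blocks do not suffer from this problem because braces are required in them. Strictly speaking, do...while also accepts a single unbraced statement, but its trailing while makes the extent of the body explicit.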

Commented line indentation

Another difference between the convention used by Sun/Oracle and Eclipse concerns the indentation of commented lines.

  • Sun/Oracle Style:

     public class X {
         public void x() {
             // This line is a comment.
             int x = 5;
             // x++;
         }
     }

  • Eclipse Style:

     public class X {
         public void x() {
     //      This line is a comment.
             int x = 5;
     //      x++;
         }
     }

Personally, I hate the Eclipse style, since the spaces between // and the text are no longer indentation as defined (whitespace at the beginning of the line), and if the indentation is done with tabs, this results in tabs in the middle of the line instead of just at the beginning, which is horrible. Also, an inattentive reader might not notice that the x++; line is commented out, especially if there are many levels of indentation and the editor has no syntax coloring, or an inadequate one.

Others

There are other concepts to consider as well, such as:

  • Should line breaks be placed before or after binary operators in very long logical or mathematical expressions? Placing the break before the operator is the prevailing choice, because it makes clear that the line in question is a continuation of the previous one.

  • Where to wrap lines in method calls with many complex parameters?

  • What is the best way to order the attributes, methods, constructors and inner classes within a given class?

  • What is the order of annotations to be applied to classes, attributes, methods, and constructors?

  • What are the best ways to give good names to classes, methods, attributes, parameters, and local variables, avoiding names that get too long while being descriptive and understandable enough?

  • Should all the code be written in English, or should identifiers be named in Portuguese (or some other language)? Are identifiers that mix two different languages acceptable? If you want everything in English, are the project's programmers fluent enough in English?

  • Put a line break at the end of the file or not?

  • Where to put blank lines inside the code of some method?

  • Should the source code line breaks be \r (classic Mac OS), \n (Unix/Linux/macOS) or \r\n (Windows)?

  • Should the character encoding to be used be UTF-8 or ISO-8859-1? UTF-8 has proven to be increasingly advantageous in this dispute due to better standardization, less probability of unpleasant surprises with encodings and the possibility of encoding any character from anywhere in the world, including emojis 😂.

  • A lot of other little details you can imagine.

The convention you should adopt

Ultimately, the choice of which conventions to follow is up to you. In the case of naming identifiers, I see little reason to stray from the convention: although it could have been better, you will already be using a bunch of library classes and methods that follow it (even those in the java.lang package), so going against it would leave your code with a heterogeneous, non-standard style.

On the other hand, the choice of tabs vs. spaces, indentation size, { at the end of the line or on its own line, maximum line width, where to place spaces, etc., is left more to your discretion, and there you have more freedom. Just think about the pros and cons of each approach before deciding, and whatever you decide, be consistent about it.

Checkstyle

There is also a tool widely used in many professional-level Java projects called Checkstyle . This tool checks whether Java code conforms to the style rules that you define for the project, reporting any violations, however small.

The tool is very flexible and configurable: it integrates with all the widely used IDEs and lets you specify in an XML file which style rules to adopt. It is free and open source, its development is very active, with frequent updates, and it is ready for Java 11.
