Shevek ([info]shevek) wrote,
@ 2008-05-07 18:30:00
Entry tags:java

I'm quite happy and bouncy; I just thought this was cool, so I'm making a technical post anyway.

I think that statement labels are one of the most underused features of Java. They're elegant, and a very simple solution to a problem which people often solve using boolean variables and complex additional conditions, incurring some runtime overhead and a good deal of clutter. Most people probably know that this is valid Java:

LABEL: {
    ...
    if (cond) break LABEL;
    ...
    if (cond) break LABEL;
    ...
}

Or this:
LABEL:
while (...) {
    while (...) {
        .... break LABEL;
    }
}
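
To make the comparison with the boolean-variable approach concrete, here is a minimal sketch. The class and method names (LabeledSearch, rowWithFlag, rowWithLabel) are made up for illustration; both methods return the index of the first row containing the target, or -1.

```java
class LabeledSearch {
    // Flag version: 'found' has to be re-tested in both loop conditions.
    static int rowWithFlag(int[][] grid, int target) {
        int row = -1;
        boolean found = false;
        for (int r = 0; r < grid.length && !found; r++) {
            for (int c = 0; c < grid[r].length && !found; c++) {
                if (grid[r][c] == target) {
                    row = r;
                    found = true;
                }
            }
        }
        return row;
    }

    // Label version: one break leaves both loops at once.
    static int rowWithLabel(int[][] grid, int target) {
        int row = -1;
        SEARCH:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    row = r;
                    break SEARCH;
                }
            }
        }
        return row;
    }

    public static void main(String[] args) {
        int[][] grid = {{1, 2}, {3, 4}, {5, 4}};
        System.out.println(rowWithFlag(grid, 4));
        System.out.println(rowWithLabel(grid, 4));
    }
}
```

The two should agree on every input; the labelled one just says where it is going.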

But did you know that all of the following are also valid Java?
boolean c0 = ...;
L0: if (c0) { } 
if (c0) L1: { } 
if (c0) { } else L2: { } 
if (c0) { } else L3: if (c0) { } 

So, having accepted that, what does the following code fragment print?
if (true) L4:
    break L4;
else 
    System.out.println("First else clause");

L5: if (true)
    break L5;
else
    System.out.println("Second else clause");

For bonus marks:
1) What is the bytecode generated by a recent Sun javac? This might surprise you, given the design policies of Java, but it makes a lot of sense.
2) Under what circumstances will javac not do that? (This, every Java programmer ought to know.)
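
If you want to try the fragment yourself, a self-contained wrapper might look like the sketch below. The class name LabelPuzzle and the helpers puzzle and run are assumptions of mine, and the only change to the fragment is that it prints to a supplied stream so the output can be captured.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class LabelPuzzle {
    // The fragment from the post, printing to 'out' instead of System.out.
    static void puzzle(PrintStream out) {
        if (true) L4:
            break L4;
        else
            out.println("First else clause");

        L5: if (true)
            break L5;
        else
            out.println("Second else clause");
    }

    // Runs the fragment and returns whatever it printed.
    static String run() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        puzzle(new PrintStream(buf, true));
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

Disassembling the result with javap -c is one way to look at the first bonus question.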


Here is the digraph handling from jcpp, a pure Java implementation of the C preprocessor. This is a second pass, and perhaps there's an even better way to do it. The first pass was the obvious block of conditionals, but I think this reads better.
            case '%':
                d = read(); 
                if (d == '=') 
                    tok = new Token(MOD_EQ);
                else if (d == '>') 
                    tok = new Token('}');   // digraph
                else if (d == ':') PASTE: {
                    d = read(); 
                    if (d != '%') {
                        unread(d);
                        tok = new Token('#');   // digraph
                        break PASTE;
                    }    
                    d = read(); 
                    if (d != ':') {
                        unread(d);
                        unread('%');
                        tok = new Token('#');   // digraph
                        break PASTE;
                    }       
                    tok = new Token(PASTE); // double-digraph
                }
                else
                    unread(d);
                break;
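
The fragment above depends on the rest of the lexer, so here is a rough standalone sketch of the same shape that can be run on its own. Everything in it is an assumption for illustration: the class DigraphSketch, the methods afterPercent and lex, a java.io.PushbackReader standing in for read()/unread(), and plain strings standing in for Token objects. It is not the jcpp source.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class DigraphSketch {
    // Classifies the token that begins with '%'; the '%' itself is already consumed.
    static String afterPercent(PushbackReader in) throws IOException {
        String tok = "%";
        int d = in.read();
        if (d == '=')
            tok = "%=";
        else if (d == '>')
            tok = "}";              // digraph
        else if (d == ':') PASTE: {
            d = in.read();
            if (d != '%') {
                if (d != -1) in.unread(d);
                tok = "#";          // digraph
                break PASTE;
            }
            d = in.read();
            if (d != ':') {
                if (d != -1) in.unread(d);
                in.unread('%');
                tok = "#";          // digraph
                break PASTE;
            }
            tok = "##";             // double-digraph
        }
        else if (d != -1)
            in.unread(d);
        return tok;
    }

    // Convenience wrapper: 'rest' is the input that follows the '%'.
    static String lex(String rest) {
        // Pushback size 2 because the worst case unreads two characters.
        try (PushbackReader in = new PushbackReader(new StringReader(rest), 2)) {
            return afterPercent(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"=", ">", ":", ":%:", ":%x", "x"})
            System.out.println("%" + s + " -> " + lex(s));
    }
}
```

Note that a label and a constant may share the name PASTE, as in the original, because labels live in their own namespace.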





[info]topbit
2008-05-07 06:42 pm UTC (link)
I think there is still "A Case Against the Goto Statement" - even if it's called a labelled break.

I tend to go for switches and some fancy (non-)loops - a regular favourite of mine is
do {
  if (cond1) break;
  if (cond2) break;
  if (cond3) {
    // change stuff
    // try again
    continue;
  }
  ...
} while (false);


You can break out of it early, or continue around the loop - or just drop out the end, and it usually avoids dropping into more than a couple levels of indentation. It should be easy enough to spin blocks off into separate functions as well, which will help keep it tidy, and understandable.

That inner "if(d==':')" could be easily cleaned up with a do/while(false) to avoid a 'break LABEL;'
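
One caveat worth checking before relying on the "try again" part: in Java, a continue inside do/while jumps to the condition test, so with while (false) it leaves the loop rather than going round again. A small sketch (class and method names are made up for illustration) that counts passes through the body:

```java
class DoWhileFalseDemo {
    // continue goes to the 'while (false)' test, so the body runs once.
    static int passesWithPlainContinue() {
        int passes = 0;
        do {
            passes++;
            if (passes < 3) continue;
        } while (false);
        return passes;
    }

    // A loop that really retries: continue goes back to the top until the break.
    static int passesWithLabeledRetry() {
        int passes = 0;
        RETRY:
        while (true) {
            passes++;
            if (passes < 3) continue RETRY;
            break;
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(passesWithPlainContinue());
        System.out.println(passesWithLabeledRetry());
    }
}
```

So the do/while(false) form works as an early-exit block, but a retry needs a loop whose condition can still be true.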

Another big favourite, while I'm here, is reversing a conditional/constant check
if ('=' == d)
It may be less easy in a compiled language but in a scripting language it will abort hard and fast if you dropped an = sign (so "'z' = $x" breaks hard).



[info]babysimon
2008-05-07 07:59 pm UTC (link)
Another big favourite, while I'm here, is reversing a conditional/constant check... It may be less easy in a compiled language

In Java the type system would catch this at compile time, regardless of which way round the test is.
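
The one case I know of where the type system cannot help is a boolean variable, since an assignment to it is itself a boolean expression. A minimal sketch, with hypothetical names (AssignmentInCondition, typo, reversed):

```java
class AssignmentInCondition {
    // Compiles: 'flag = true' is an assignment whose value is true,
    // so the argument is ignored and the branch is always taken.
    static boolean typo(boolean flag) {
        if (flag = true) return true;
        return false;
    }

    // The reversed comparison from the comment above, written with ==.
    // For a char, both 'if (d = x)' and 'if (x = d)' with a literal x
    // are rejected by javac, so the order does not matter here.
    static boolean reversed(char d) {
        return '=' == d;
    }

    public static void main(String[] args) {
        System.out.println(typo(false));
        System.out.println(reversed('='));
    }
}
```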



[info]shevek
2008-05-07 09:31 pm UTC (link)
I do this too, but in larger settings, I think it's fragile against maintenance. I think if you're going to refer to a remote point, say the beginning or end of a block you're in, you might as well say which one by name; that way, when the "block you're directly in" changes accidentally, your code doesn't break.



[info]deflatermouse
2008-05-07 06:57 pm UTC (link)
The second example (with the L4, L5 labels) compiles fine for me under 1.6.0 but fails to run with
Exception in thread "main" java.lang.NoClassDefFoundError: LabelTest (wrong name: com/sixapart/search/test/LabelTest)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)


:(



[info]deflatermouse
2008-05-07 08:49 pm UTC (link)
I tell a lie, I was being unutterably dumb.

Under 1.6.0 it compiles and runs fine (presuming that "printing nothing" is the expected result).



[info]shevek
2008-05-07 09:33 pm UTC (link)
Indeed. Did you disassemble it, and note that the strings to be printed don't even appear in the constant table in 1.6? (That's the first of the 'extra' questions).



[info]babysimon
2008-05-07 08:17 pm UTC (link)
I'm not sure the L4, L5 example is the best one you could give, as the break statements don't output the result of the program.

I have to admit, whenever I see a labelled break or continue, I slow right down, thinking: "Right, whoever wrote this was trying to be clever. I'd better be on my guard..."



[info]babysimon
2008-05-07 08:43 pm UTC (link)
I mean, affect the result. Or affect the output. Damn.



[info]shevek
2008-05-07 09:34 pm UTC (link)
Well, it set me thinking, because clearly if you break the 'if', the L4 case, you jump past all else blocks belonging to that 'if' (remembering the strict definition of 'if'). However, in the L4 case, I did wonder for a moment whether it would print something, then I realised of course it couldn't.

The other interesting point I noted (first of the "extra" questions) is that the strings don't even appear in the constant table.



[info]rhialto
2008-05-12 09:50 pm UTC (link)
but... but... but...

in the L4: case, the L4: doesn't label the if, it labels only the "break L4". So if the definition of "break" is "jump just past the end of the labeled statement", it will jump just past the end of "break L4", then jump around the else part as usual, and print nothing in that way.

With L5, the break would jump past the whole if statement, like you say.

(or did you mean L5 the first time where you typed L4, in paragraph 1?)



[info]shevek
2008-05-13 05:34 am UTC (link)
Having jumped past the 'break L4', the semantics of the if/else make it jump past the else as well. At which point the jump-threader kicks in and replaces two chained jumps with a single jump, producing the code you see. The unreachable block is then elided.



[info]icklemichael
2008-05-08 01:25 pm UTC (link)
Has your example been simplified too far or have I misunderstood? Is the fact that if(false) doesn't generate (although is not required not to generate) any bytecode surprising in some way?



[info]shevek
2008-05-08 01:32 pm UTC (link)
It's surprising because I believe this was not true in earlier JDKs. I'd have to check. Certainly, a static final boolean used as a condition variable does NOT (currently) cause the same effect as a literal boolean. Also, by design decision, all of the constant folding, jump threading, and so forth happens in the hotspot compiler, so it's a small surprise to see it ALSO done in the bytecode compiler - this has little performance impact.

This also has an impact on reachability analysis in static analyzers, which mostly drive themselves off the bytecode, so e.g. a new findbugs won't always discover nonreachable code using JDK 1.6.



[info]icklemichael
2008-05-08 02:16 pm UTC (link)
Versions 1.3 through 6 all behave the same for me, and I don't ever recall using a Sun javac which didn't. I think jikes used to leave it in (presumably to allow you to use reflection to change static final booleans).

You say that static final booleans behave differently, though, which I've never noticed. Have you got an example?



[info]shevek
2008-05-08 02:37 pm UTC (link)
Now I think about it, all I did was a local "boolean c1 = true;" I didn't mark it final.
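
For anyone who wants to compare the three cases side by side, a sketch along these lines could be compiled and then inspected with javap -c -v. The class and method names are made up, and I'm not asserting what any particular javac emits for each method; the point is only that the three conditions differ in whether they are compile-time constants.

```java
class ConstantBranches {
    static final boolean DEBUG = false; // a compile-time constant

    // Condition is a boolean literal.
    static String literal() {
        if (false) return "literal branch";
        return "skipped";
    }

    // Condition is a static final constant.
    static String staticFinal() {
        if (DEBUG) return "static final branch";
        return "skipped";
    }

    // Condition is a plain local variable, not marked final,
    // so it is not a constant expression.
    static String nonFinalLocal() {
        boolean c1 = false;
        if (c1) return "local branch";
        return "skipped";
    }

    public static void main(String[] args) {
        System.out.println(literal());
        System.out.println(staticFinal());
        System.out.println(nonFinalLocal());
    }
}
```

All three behave identically at run time; whether each "... branch" string shows up in the constant pool listing is the thing to look for.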


