How Codebase Works


Without a doubt, the source of most hassles facing new Jini and RMI programmers is improper use of codebase. Codebase problems can manifest themselves in many ways, including programs that fail silently and programs that run in some circumstances but fail in others. On this page I'll talk about some of the common misconceptions and problems I've found in using codebase, as well as some solutions.


Background

    When you pass an object into or out of a remote method call, the object is serialized--turned into a stream of bytes--for transmission, and reconstituted at the other end of the wire. This behavior makes it possible for practically any arbitrary Java object to be exchanged between two Java Virtual Machines, with little or no direct programmer involvement.

    But serialization is only part of the story. When you serialize an object, only the member data within that object is written to the byte stream; not the code that actually implements the object.

    This is one of the common misunderstandings of beginning RMI programmers. While RMI is touted for its ability to transmit completely new objects to a Java program that may have never heard of this object before, the RMI protocol itself doesn't do the transmission of the code. Instead, RMI provides the basic facilities to allow the receiver of an object to fetch the code for the object's class, if it's never seen it before and doesn't have it available locally.
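
    If you want to convince yourself that a serialized object carries data but not code, here's a minimal sketch. The GraphBean class and its fields are made up for illustration. It serializes an instance into a byte array and checks which names show up in the stream: the class name and field names are there, but nothing from the area() method is.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SerializedFormDemo {

    // Hypothetical class, used only for illustration.
    static class GraphBean implements Serializable {
        private static final long serialVersionUID = 1L;
        int width = 640;
        int height = 480;

        // This is code, not member data, so it is never written to the stream.
        int area() { return width * height; }
    }

    // Serializes an object into a byte array, much as RMI does before sending it.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new GraphBean());
        // ISO-8859-1 maps every byte to one char, so names in the stream stay searchable.
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println("stream size:       " + bytes.length + " bytes");
        System.out.println("has class name:    " + text.contains("GraphBean"));
        System.out.println("has field 'width': " + text.contains("width"));
        System.out.println("has method 'area': " + text.contains("area"));
    }
}
```

    A receiver that gets these bytes still needs GraphBean.class from somewhere, which is exactly the gap that codebase fills.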

What is Codebase?
    Codebase is, quite simply, how programs that use RMI's semantics of remote class loading find new classes. When the sender of an object serializes that object for transmission to another JVM, it annotates the serialized stream of bytes with information called the codebase. This information tells the receiver where the implementation of this object can be found.

    You should make sure you understand the distinction here between who sets and who uses the codebase information. Any program that thinks it might ever pass an object to another program that may not have seen it before must set the codebase, so that the receiver can know where to download the code from, if it doesn't have the code available locally. The receiver will, upon deserializing the object, fetch the codebase from it and load the code from that location.

    The actual information stored in the codebase annotation is a list of URLs from which the classfile for the needed object can be downloaded.
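
    To make the idea of an annotated stream concrete, here's a simplified sketch of the mechanism using the standard annotateClass and resolveClass hooks of the Java serialization streams. This is not RMI's actual implementation, and the class names (AnnotationSketch, Point, and the two stream subclasses) are invented for the example. The sender writes the codebase string right after each class descriptor, and the receiver reads it back when it resolves the class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;
import java.io.Serializable;

public class AnnotationSketch {

    // Hypothetical payload class.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 3;
        int y = 4;
    }

    // Sender side: tag every class descriptor with the codebase string.
    static class AnnotatingOutputStream extends ObjectOutputStream {
        private final String codebase;

        AnnotatingOutputStream(OutputStream out, String codebase) throws IOException {
            super(out);
            this.codebase = codebase;
        }

        @Override
        protected void annotateClass(Class<?> cl) throws IOException {
            writeObject(codebase);
        }
    }

    // Receiver side: read the annotation back before resolving the class.
    static class AnnotatedInputStream extends ObjectInputStream {
        String lastCodebase;

        AnnotatedInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            lastCodebase = (String) readObject();
            // This sketch only resolves locally. A real receiver would fall
            // back to the codebase URLs when the class is not found locally.
            return super.resolveClass(desc);
        }
    }

    // Sends a Point through both streams and returns the codebase the receiver saw.
    static String roundTrip(String codebase) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (AnnotatingOutputStream out = new AnnotatingOutputStream(buffer, codebase)) {
                out.writeObject(new Point());
            }
            try (AnnotatedInputStream in =
                     new AnnotatedInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                in.readObject();
                return in.lastCodebase;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("receiver saw codebase: " + roundTrip("http://poobah:8080/"));
    }
}
```

    Notice that the sender decides what goes into the annotation, and the receiver decides what to do with it.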

    If you don't set codebase, then you can't pass an object to any program that doesn't have that object's classfile already available locally.

How is codebase set?
    In most cases, when you run a program that may send new objects to a receiver (in other words, it "exports" downloadable code), you will set the codebase via a property on the command line. For example, if you're running a web server on machine poobah on port 8080, you would make sure that any needed classfiles are installed under that webserver's root directory, and pass the following on the Java command line when running your application:
        -Djava.rmi.server.codebase=http://poobah:8080/
    
    This tells any program that receives an object of a class it has never seen from your program to download it from the specified URL.

Where should downloadable code live?
    When a receiving program attempts to load the code from the poobah webserver, as above, it will turn the class name for the object into a URL and try to load it relative to the codebase. So, for instance, if you've passed an instance of the class yourcorp.project.GraphBean to a receiver using the codebase http://poobah:8080/ above, the receiver will try to load this code from the URL:
        http://poobah:8080/yourcorp/project/GraphBean.class
    
    Make sure that you place your classfiles under the appropriate directory in the web server's "root" directory, so they can be properly found.
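
    The mapping from class name to URL is mechanical, so it's easy to check where a given classfile has to live. Here's a small sketch of it; the CodebaseUrls class and its method name are my own, not part of any RMI API.

```java
public class CodebaseUrls {

    // Maps a fully qualified class name to the URL a receiver would request
    // from a directory-style codebase (one that ends in a slash).
    static String classFileUrl(String codebase, String className) {
        return codebase + className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFileUrl("http://poobah:8080/", "yourcorp.project.GraphBean"));
        // prints http://poobah:8080/yourcorp/project/GraphBean.class
    }
}
```

    The part after the codebase (yourcorp/project/GraphBean.class) is also the path the file needs to have relative to the web server's root directory.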

    Alternatively, you may wrap all of the classfiles you need to export into a JAR file, and set the codebase URL so that it explicitly points to this JAR. For example, if you've bundled all of the code that may need to be downloaded by clients into a file called MyClasses.jar, then you could use the following codebase:

        -Djava.rmi.server.codebase=http://poobah:8080/MyClasses.jar
    
    Be sure you note the subtle differences in syntax between these two versions! In the first example, the codebase refers to a directory, and must end in a slash! In the second, the codebase explicitly refers to a file and should not end in a slash.

Codebase and security
    One additional thing to watch out for is how downloadable code interacts with Java's security mechanisms. The codebase property is, in essence, a way for one program to dynamically extend another's CLASSPATH at runtime. Clearly there are security implications to doing this--you don't want arbitrary programs that you might be talking to causing you to load random code.

    So Java disables all remote classloading unless a security manager has been installed. Without a security manager, a Java application will only load classes locally, and will disregard any codebase information that it receives from other programs.

    Most RMI programs will need to be able to download the stubs for remote objects they talk to, and all Jini programs need to be able to download service proxies. So if you don't install a security manager you will effectively break all of these programs. (Although, see below for details on how this can appear to work, even when it doesn't.)

    The moral here is to install a security manager whenever you might be dealing with remotely-loaded code, even though security managers can bring their own hassles.

How codebase can screw you
    Many of the problems with codebase arise because programs that often appear to work actually hide subtle problems. Most of us try to set up our development environments to be as convenient as possible. This often means that, when developing distributed applications, we work on both the client side and the server side of our applications in the same directory trees.

    This is where problems start to arise. If you don't correctly use codebase, there's still a good chance that your programs will work while you're developing them: any code needed by either a client or server will simply be loaded locally from the CLASSPATH. This may happen because you've forgotten to install a security manager (without one, remote class loading is disabled), or because you neglected to set the codebase property, or because the needed classfiles aren't available on a web server.

    Regardless of how it happens, the end result is that code that should be loaded remotely actually gets loaded locally from the CLASSPATH. So while your programs may work in your development directory, they're likely to break once you actually deploy them.

    I've got a page that talks about how to set up your environment to "simulate" a multimachine environment. While the guidelines there can be a bit of a hassle, they definitely keep problems from lurking unnoticed until deployment time. See Developing for Deployment for more information.

Common codebase gotchas
    Beyond problems surrounding a basic misunderstanding of how codebase works, there are a number of other common but minor problems that often affect Jini and RMI developers:

    Specifying multiple URLs in a codebase

      The proper way to separate multiple URLs in a single codebase is with spaces. On most systems, you'll have to quote the value of the codebase property so that it won't be interpreted as multiple parameters. For example:
           -Djava.rmi.server.codebase="http://poobah:8080/ http://poobah:8080/MyClasses.jar"
       

    Obeying proper URL syntax

      Remember that if you're naming a directory in a codebase URL, the URL must end with a slash. If you're naming a JAR file, then the codebase must not end with a slash.

    Never use file URLs

      If you don't have a webserver handy, or don't want to take the trouble to start one just for development and debugging, you may be tempted to just use a file: URL as a codebase. While this will work, it's usually not a good idea.

      If a server passes a file codebase to a client, the client will try to load any needed code from its own local filesystem. If you're developing and testing both the client and the server on the same machine (or two closely networked machines with the same filesystem) then this may work--since the class files will live in the same place for both the client and the server. But if you ever run the programs on separate machines that do not share a filesystem, your code will suddenly break.

    Never use localhost in a codebase URL

      "Localhost" is used to refer to the current host, so it should be clear why this is a bad idea. If a server sets the codebase to a URL containing localhost, the client will evaluate this codebase and attempt to load the code from its system rather than the server's. Again, this situation may deceptively work if you're testing the client and the server on the same machine. But your code will break once deployed on different machines.
Does RMI really require me to run a webserver?
    You may be thinking that having to have a separate web server available on a network just to do RMI or Jini development is a big hassle. There are actually a few good reasons for separating code downloading from the RMI protocol proper:

    • First, although Jini supports RMI's code loading semantics (meaning that code will be remotely loaded via codebase properties attached to serialized object streams), there is no requirement for Jini services to actually be implemented using RMI. So by divorcing the code loading protocol from the communications protocol, Jini services built using raw sockets (or whatever) can still remotely load code.
    • By using codebase URLs, you have a lot of flexibility over where downloadable code is stored and administered. So, for example, you might have a big corporate web server with all your downloadable code on it. Or individual development teams might have their own webservers. The approach taken by RMI allows a lot of flexibility.
    • Finally, and most importantly, Java's security primitives are (as of this release anyway) largely based on where code originated from. Security policy files grant permissions based on where code was loaded, and location is specified using URLs. So the RMI scheme fits in nicely with Java's security mechanisms.

    Of course, for lots of common uses, requiring a separate web server is a big hassle, especially for quick development and testing.

    Members of the RMI team developed the classserver package as a demonstration of a way to build minimal code exporting into any Java program. Using this package, you can instantiate a very minimal HTTP server (basically just smart enough to export classfiles) into your RMI server programs. Once running, your program can programmatically install a codebase property that points to the HTTP server running inside itself. Presto! No more hassles with putting code on a web server, although your RMI servers do get slightly more porky with this approach.

    The classserver code is a demonstration only, and is copyright Sun Microsystems. You can download it from HERE.
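
    To give a feel for the idea, here's a rough sketch of an embedded class server. To be clear, this is not the classserver package: it's my own illustration, built on the com.sun.net.httpserver.HttpServer class that ships with current JDKs, and the EmbeddedClassServer name and its methods are made up. It serves any classfile found on the program's own CLASSPATH and then points the codebase property at itself.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.UnknownHostException;

public class EmbeddedClassServer {

    // Starts a tiny HTTP server that exports classfiles from this program's
    // CLASSPATH. Passing port 0 asks the OS for any free port.
    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/", exchange -> {
                // "/yourcorp/project/GraphBean.class" -> "yourcorp/project/GraphBean.class"
                String resource = exchange.getRequestURI().getPath().substring(1);
                byte[] body = null;
                if (resource.endsWith(".class")) {   // export classfiles only
                    ClassLoader loader = EmbeddedClassServer.class.getClassLoader();
                    try (InputStream in = loader.getResourceAsStream(resource)) {
                        if (in != null) {
                            body = in.readAllBytes();
                        }
                    }
                }
                if (body == null) {
                    exchange.sendResponseHeaders(404, -1);   // -1 means no response body
                } else {
                    exchange.sendResponseHeaders(200, body.length);
                    exchange.getResponseBody().write(body);
                }
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds the codebase URL for the running server. The host name must be one
    // that other machines can resolve, so check that this is not "localhost".
    static String codebaseFor(HttpServer server) throws UnknownHostException {
        String host = InetAddress.getLocalHost().getHostName();
        return "http://" + host + ":" + server.getAddress().getPort() + "/";
    }

    // Fetches a URL the way a receiver would, for a quick self-check.
    static byte[] fetch(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start(0);
        // Set the property before exporting or sending any objects.
        System.setProperty("java.rmi.server.codebase", codebaseFor(server));
        System.out.println("codebase = " + System.getProperty("java.rmi.server.codebase"));
        // The server keeps running, and keeps the JVM alive, until server.stop() is called.
    }
}
```

    Serving straight from the CLASSPATH is convenient, but it exports every class your program can see. A real deployment would probably want to restrict what gets served.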

    Update: Here's another very cool solution. Chris Coy has developed a framework that uses multicast as a way to pick up URLs for code servers, along with a JAR indexer that lets you easily and transparently serve up classfiles without any configuration at all. Check it out HERE.


Keith Edwards
kedwards@kedwards.com


Copyright 1999, W. Keith Edwards