performance · testing · java

Load Testing Java Applications Before JMeter

4 min read


JMeter was open-sourced in 1999. Before that, load testing a Java application meant writing a test harness yourself, choosing what to measure, and interpreting raw output without dashboards.

At Motorola we needed to know how many devices the NMS (network management system) could poll simultaneously before response times degraded. We built a load harness in Java that simulated device responses and measured the poller under increasing concurrency.

What We Were Testing

The NMS polled devices via SNMP every 30 seconds. Each poll opened a UDP socket, sent a GET request, waited for a response, and updated the device registry. The questions we needed to answer:

  • How many concurrent polls could the system handle before the 30-second cycle time was missed?
  • What happened to memory and CPU at 200, 400, 600 simulated devices?
  • Where was the bottleneck — network I/O, CPU, the device registry lock?
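
Each poll in that cycle is essentially one UDP round trip with a timeout. A minimal sketch (hypothetical names; the real poller encoded SNMP PDUs rather than the plain-text payload used here):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class SinglePoll {
    /**
     * Sends one request datagram and waits up to timeoutMs for a reply.
     * Returns the reply payload, or null if the device did not answer in time.
     */
    static byte[] poll(InetAddress host, int port, byte[] request, int timeoutMs)
            throws Exception {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs);
            socket.send(new DatagramPacket(request, request.length, host, port));
            byte[] buf = new byte[512];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(reply);
            } catch (java.net.SocketTimeoutException e) {
                return null; // device missed the budget
            }
            return java.util.Arrays.copyOf(reply.getData(), reply.getLength());
        }
    }

    public static void main(String[] args) throws Exception {
        // Loopback echo responder standing in for a device.
        DatagramSocket device = new DatagramSocket(0);
        Thread responder = new Thread(() -> {
            try {
                byte[] buf = new byte[512];
                DatagramPacket req = new DatagramPacket(buf, buf.length);
                device.receive(req);
                device.send(new DatagramPacket(req.getData(), req.getLength(),
                        req.getAddress(), req.getPort()));
            } catch (Exception ignored) { }
        });
        responder.start();

        byte[] reply = poll(InetAddress.getLoopbackAddress(), device.getLocalPort(),
                "GET sysUpTime".getBytes(StandardCharsets.UTF_8), 2000);
        System.out.println(reply == null
                ? "timeout"
                : new String(reply, StandardCharsets.UTF_8));
        device.close();
    }
}
```

The timeout is what turns a slow device into a measurable event rather than a hung poller thread.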

The Simulated Device

We wrote a UDP server that impersonated SNMP devices at configurable latencies:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;

public class SimulatedSnmpAgent implements Runnable {
    private final DatagramSocket socket;
    private final int            delayMs;

    public SimulatedSnmpAgent(int port, int delayMs) throws SocketException {
        this.socket  = new DatagramSocket(port);
        this.delayMs = delayMs;
    }

    public void run() {
        byte[] buf = new byte[512];
        while (!Thread.currentThread().isInterrupted()) {
            try {
                DatagramPacket req = new DatagramPacket(buf, buf.length);
                socket.receive(req);
                if (delayMs > 0) Thread.sleep(delayMs); // simulate device latency
                // Echo a minimal valid SNMP response
                byte[] response = buildSnmpResponse(req.getData());
                DatagramPacket resp = new DatagramPacket(
                    response, response.length, req.getAddress(), req.getPort());
                socket.send(resp);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit cleanly on shutdown
            } catch (Exception e) {
                // log and continue; one bad packet should not kill the agent
            }
        }
    }
}

We ran 500 of these on a separate host, each on a different port, responding with synthetic sysUpTime and sysDescr values.
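
Launching the fleet was just a loop over ports and threads. A minimal launcher sketch (hypothetical names, ephemeral ports, and a plain echo standing in for buildSnmpResponse):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.ArrayList;
import java.util.List;

public class AgentFarm {
    public static void main(String[] args) throws Exception {
        int count = 5; // 500 in the real runs; kept small here
        List<DatagramSocket> sockets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            DatagramSocket socket = new DatagramSocket(0); // ephemeral port
            sockets.add(socket);
            Thread agent = new Thread(() -> {
                byte[] buf = new byte[512];
                while (!socket.isClosed()) {
                    try {
                        DatagramPacket req = new DatagramPacket(buf, buf.length);
                        socket.receive(req);
                        // Echo stand-in for a real SNMP response
                        socket.send(new DatagramPacket(req.getData(), req.getLength(),
                                req.getAddress(), req.getPort()));
                    } catch (Exception e) {
                        break; // socket closed, shut the agent down
                    }
                }
            });
            agent.setDaemon(true);
            agent.start();
            System.out.println("agent listening on port " + socket.getLocalPort());
        }
        sockets.forEach(DatagramSocket::close);
    }
}
```

One thread per agent was affordable because each agent spends almost all its time blocked in receive().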

The Load Driver

The driver spun up the NMS with a synthetic device list and measured cycle times:

import java.util.Arrays;

public class LoadDriver {
    public static void main(String[] args) throws Exception {
        int deviceCount = Integer.parseInt(args[0]);
        int cycles      = Integer.parseInt(args[1]);

        DeviceRegistry registry = buildRegistry(deviceCount);
        Poller         poller   = new Poller(registry);

        long[] cycleTimes = new long[cycles];
        for (int c = 0; c < cycles; c++) {
            long start = System.currentTimeMillis();
            poller.runCycle(); // polls all devices, waits for completion
            cycleTimes[c] = System.currentTimeMillis() - start;
            System.out.printf("Cycle %d: %dms%n", c, cycleTimes[c]);
        }

        printStats(cycleTimes);
    }

    static void printStats(long[] times) {
        long[] sorted = times.clone(); // don't reorder the caller's array
        Arrays.sort(sorted);
        long sum = 0;
        for (long t : sorted) sum += t;
        System.out.printf("Mean: %dms  P95: %dms  Max: %dms%n",
            sum / sorted.length,
            sorted[(int) (sorted.length * 0.95)],
            sorted[sorted.length - 1]);
    }
}

What We Found

At 200 devices: mean cycle time 4.2 seconds, P95 6.1 seconds. Well within the 30-second budget.

At 400 devices: mean 12.4 seconds, P95 18.9 seconds. Acceptable.

At 600 devices: mean 34.1 seconds — we had exceeded the cycle budget. P95 was 48 seconds.

The bottleneck was not CPU or network. It was the synchronized lock on DeviceRegistry.update(). Every completed poll acquired the lock to write its result. At high concurrency, threads spent more time waiting for the lock than doing work.
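
In spirit, the contended path looked like this (a reconstruction with hypothetical names, not the original code): every poller thread funnels through the registry's single monitor to record its result.

```java
import java.util.HashMap;
import java.util.Map;

public class CoarseRegistry {
    private final Map<String, Long> lastSeen = new HashMap<>();

    // One monitor for the whole registry: with hundreds of poller threads
    // finishing at roughly the same time, every result write serializes here.
    public synchronized void update(String deviceId, long timestamp) {
        lastSeen.put(deviceId, timestamp);
    }

    public synchronized int size() {
        return lastSeen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        CoarseRegistry registry = new CoarseRegistry();
        Thread[] pollers = new Thread[100];
        for (int i = 0; i < pollers.length; i++) {
            final String id = "device-" + i;
            pollers[i] = new Thread(() ->
                    registry.update(id, System.currentTimeMillis()));
            pollers[i].start();
        }
        for (Thread t : pollers) t.join();
        System.out.println("devices updated: " + registry.size());
    }
}
```

The updates themselves are cheap; the cost is the queue of threads waiting for the one lock.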

The fix: replace the single lock with per-device locks (a ConcurrentHashMap equivalent, manually implemented — java.util.concurrent did not exist yet). Mean cycle time at 600 devices dropped to 8.3 seconds.
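
The pre-java.util.concurrent idiom for this is hand-rolled lock striping: allocate one lock object per device up front, and synchronize only on the lock for the device being updated. A sketch under that assumption (hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;

public class StripedRegistry {
    // Both maps are built once, before polling starts, and never structurally
    // modified afterwards, so looking up a lock needs no synchronization.
    private final Map<String, Object> locks    = new HashMap<>();
    private final Map<String, long[]> lastSeen = new HashMap<>();

    public StripedRegistry(String[] deviceIds) {
        for (String id : deviceIds) {
            locks.put(id, new Object());
            lastSeen.put(id, new long[1]); // mutable cell guarded by the lock
        }
    }

    // Writers for different devices no longer contend with each other.
    public void update(String deviceId, long timestamp) {
        synchronized (locks.get(deviceId)) {
            lastSeen.get(deviceId)[0] = timestamp;
        }
    }

    public long getLastSeen(String deviceId) {
        synchronized (locks.get(deviceId)) {
            return lastSeen.get(deviceId)[0];
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String[] ids = {"device-0", "device-1", "device-2"};
        StripedRegistry registry = new StripedRegistry(ids);
        Thread[] pollers = new Thread[ids.length];
        for (int i = 0; i < ids.length; i++) {
            final String id = ids[i];
            pollers[i] = new Thread(() -> registry.update(id, 42L));
            pollers[i].start();
        }
        for (Thread t : pollers) t.join();
        System.out.println("device-1 last seen: " + registry.getLastSeen("device-1"));
    }
}
```

Today ConcurrentHashMap gives you the same effect in one line; the point is that the striping idea predates the library.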

What This Approach Lacks vs Modern Tools

JMeter, Gatling, and k6 give you parameterised load profiles, real-time graphs, automatic percentile calculation, and protocols (HTTP, JDBC, JMS) out of the box. Our harness had none of that.

Writing your own harness forced one discipline: you had to understand exactly what you were testing, what you were measuring, and what the numbers meant. No tooling hid the measurement behind a dashboard. When a modern load test produces unexpected results, understanding what the tool is actually doing, which our harness made unavoidable, is what lets you interpret them correctly.