10 Java File I/O Techniques That Cut Processing Time From Hours to Seconds
Learn 10 proven Java file I/O techniques to speed up file processing — from buffered streams to memory-mapped files. Optimize your Java I/O performance today.
I remember the day I first realized how slow file I/O could be. I had written a simple log parser that read a 5-gigabyte file line by line using FileReader without buffering. The process took almost forty minutes. Forty minutes of watching a cursor blink. That experience taught me a lesson I never forgot: the way you read and write files in Java can make the difference between a tool that works and a tool that works at the speed of thought. Over the years I collected a set of techniques that I keep in my back pocket for every file‑processing job. Let me walk you through ten of them, the ones I use most often, and show you exactly how to apply them.
Start with the simplest but most effective improvement: wrap your streams in buffers. I cannot count how many times I have seen code like this:
FileInputStream fis = new FileInputStream("data.bin");
int b;
while ((b = fis.read()) != -1) {
// do something
}
fis.close();
Every single call to read() goes straight to the operating system. That is expensive. The JVM makes a system call for each byte, and each one crosses the user/kernel boundary. The calls themselves are cheap, but paying that cost once per byte means billions of crossings for a multi‑gigabyte file, and the throughput ends up nowhere near what the hardware can deliver. By wrapping the stream in a BufferedInputStream, you let the JVM read a large chunk at once and hand you bytes from an internal array.
try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream("data.bin"))) {
int b;
while ((b = bis.read()) != -1) {
// process byte
}
}
The default buffer size is 8192 bytes. For large files I often increase it to 65536 or even 262144 so that each underlying read pulls in a much bigger chunk and the number of system calls drops accordingly. You can pass the size in the constructor. The effect is dramatic. On my old laptop, reading that same 5‑gigabyte log with a 64KB buffer took under three minutes. The same principle applies to readers:
try (BufferedReader reader = new BufferedReader(new FileReader("input.txt"), 65536)) {
String line;
while ((line = reader.readLine()) != null) {
// process line
}
}
Use BufferedReader for text files. Always. It is the single cheapest optimisation you can make.
For small files, the story is different. A configuration file, a short JSON response, a list of a hundred user names – these do not need streams. Java’s java.nio.file.Files class gives you a one‑shot method that reads the entire file into a List<String>:
List<String> lines = Files.readAllLines(Paths.get("config.properties"), StandardCharsets.UTF_8);
This is clean, short, and safe. The JVM closes everything automatically. I use this for any file I know fits comfortably in memory – say, under ten megabytes. Beyond that, you risk an out‑of‑memory error or at least a long garbage collection pause. The rule of thumb I follow: if I can open it in Notepad without the computer sweating, readAllLines is fine.
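To make that rule of thumb concrete, here is a rough sketch of the guard I sometimes put in front of it; the ten‑megabyte cutoff is just my arbitrary comfort limit, not a hard rule:
Path path = Paths.get("config.properties");
if (Files.size(path) < 10 * 1024 * 1024) { // small enough to read in one shot
    List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
    // work with the list directly
} else { // too big for comfort: fall back to lazy streaming
    try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
        lines.forEach(line -> { /* process one line at a time */ });
    }
}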
What about files that are huge, say a 50‑gigabyte binary file with a fixed record structure? Reading that with a buffered stream is possible, but if you need random access – skip to position 4GB, read a header, then jump back – the buffered approach falls apart. That is where memory‑mapped files shine. The MappedByteBuffer lets you map a region of the file directly into the virtual address space. The operating system pages the data in and out as you access it, and you read it like a plain ByteBuffer. No system calls for every byte. No copying into user space.
try (RandomAccessFile file = new RandomAccessFile("largefile.bin", "r");
FileChannel channel = file.getChannel()) {
MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
while (buffer.hasRemaining()) {
byte b = buffer.get();
// analyze bytes
}
}
Notice I used RandomAccessFile to get the channel. For read‑only mapping, this is the simplest way. The buffer behaves like a direct ByteBuffer, meaning it lives outside the garbage‑collected heap. That reduces overhead. When I worked on a binary format parser for satellite telemetry, this technique cut processing time from hours to minutes. The file was 120 gigabytes, and we only ever accessed about 200 megabytes of it. The operating system handled the rest.
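To show the random‑access side, here is a sketch that maps only a small window at a large offset; the 4‑gigabyte offset and the header fields are invented for the example:
try (FileChannel channel = FileChannel.open(Paths.get("largefile.bin"), StandardOpenOption.READ)) {
    long offset = 4L * 1024 * 1024 * 1024; // jump straight to the 4 GB mark
    int headerSize = 64; // hypothetical header length
    MappedByteBuffer header = channel.map(FileChannel.MapMode.READ_ONLY, offset, headerSize);
    int recordCount = header.getInt(); // fields made up for illustration
    long firstTimestamp = header.getLong();
    // only this 64-byte window is paged in; the rest of the file is never touched
}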
But memory‑mapping is not always the answer. For sequential line‑by‑line processing of huge text files, I prefer the Stream API that Files.lines gives you. It opens a BufferedReader under the hood and returns a Stream<String> that reads lazily. You can filter, map, and collect without loading everything into memory.
try (Stream<String> stream = Files.lines(Paths.get("transactions.csv"))) {
List<Transaction> highValues = stream
.skip(1) // header
.map(Transaction::parse)
.filter(t -> t.amount().compareTo(BigDecimal.valueOf(1000)) > 0)
.collect(Collectors.toList());
}
The stream is closed automatically when the try‑with‑resources block ends. I use this for CSV parsing, log analysis, and any job where I need to transform data on the fly. The code is declarative, easy to read, and efficient because the stream does not buffer the entire file.
Writing files also benefits from buffering. A common mistake is to flush or close the writer after every line, especially when generating reports. That forces a system call per line. Instead, use a BufferedWriter with a reasonable buffer size and flush only once or twice.
try (BufferedWriter writer = Files.newBufferedWriter(Paths.get("output.log"),
StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
for (String line : newLines) {
writer.write(line);
writer.newLine();
}
// flush happens automatically on close
}
If you are writing many small records, consider increasing the buffer to 128KB or more; note that Files.newBufferedWriter always uses the default size, so for a custom one you wrap the stream yourself, as in the sketch below. For high‑throughput logging, I would not write directly at all – use a library like Log4j with an asynchronous appender. But for simple batch exports, BufferedWriter is reliable and fast.
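Here is what that looks like in practice: a sketch with an explicit 128KB buffer wrapped around the same append‑mode output as above:
try (BufferedWriter writer = new BufferedWriter(
        new OutputStreamWriter(
                Files.newOutputStream(Paths.get("output.log"),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND),
                StandardCharsets.UTF_8),
        128 * 1024)) { // explicit 128KB buffer instead of the default
    for (String line : newLines) {
        writer.write(line);
        writer.newLine();
    }
}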
What about non‑blocking I/O? Java’s AsynchronousFileChannel lets you start a read or write and get a Future or a CompletionHandler that fires when the operation finishes. I do not use this every day, but when I have many concurrent file operations – a file server, a batch processor that reads thousands of small files – asynchronous channels keep the thread count low.
try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get("data.bin"),
        StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocate(4096);
    Future<Integer> result = channel.read(buffer, 0);
    // Do other work while reading...
    int bytesRead = result.get(); // blocks only when you need the data
}
For writing, the pattern is similar. The real power comes when you combine it with a CompletionHandler so you never block at all. But be honest with yourself: if you are reading files one at a time in a single thread, asynchronous channels add complexity without benefit. Use them only when you need concurrency without thread‑per‑operation.
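Here is a minimal sketch of the CompletionHandler style. Note there is no try‑with‑resources this time: the channel has to stay open until the callback fires, so you close it inside the handler or once the whole batch is done:
AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get("data.bin"),
        StandardOpenOption.READ);
ByteBuffer buffer = ByteBuffer.allocate(4096);
channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
    @Override
    public void completed(Integer bytesRead, ByteBuffer buf) {
        buf.flip();
        // process the bytes here, then close the channel or issue the next read
    }
    @Override
    public void failed(Throwable exc, ByteBuffer buf) {
        exc.printStackTrace(); // report and close in real code
    }
});
// the calling thread is free immediately; the handler runs when the read completes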
One of the most elegant features in NIO is transferTo(). I call it the zero‑copy copy. It tells the operating system to move data directly from one file descriptor to another without passing through application memory. It is the fastest way to copy a file on the same file system.
try (FileChannel source = FileChannel.open(sourcePath, StandardOpenOption.READ);
FileChannel dest = FileChannel.open(destPath, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
long position = 0;
long size = source.size();
while (position < size) {
position += source.transferTo(position, size - position, dest);
}
}
The loop is necessary because transferTo may not copy the entire file in one call. Internally, it delegates to the operating system’s sendfile or equivalent. I use this to copy backup files, serve static assets in web apps, and even duplicate large databases during tests. It is borderline magical.
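The static‑asset case works with the same call, because the target only needs to be a WritableByteChannel. In this sketch, clientChannel is assumed to be an already‑connected SocketChannel and assetPath the file being served:
try (FileChannel asset = FileChannel.open(assetPath, StandardOpenOption.READ)) {
    long position = 0;
    long size = asset.size();
    while (position < size) {
        // clientChannel: a connected java.nio.channels.SocketChannel (assumed)
        position += asset.transferTo(position, size - position, clientChannel);
    }
}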
Binary data requires more care. For structured binary files – headers, records, packed integers – you can use DataInputStream with a buffered stream, or use ByteBuffer from NIO. I prefer the ByteBuffer approach because it gives me control over endianness and position.
// Old way: DataInputStream
try (DataInputStream dis = new DataInputStream(new BufferedInputStream(new FileInputStream("data.bin")))) {
int magic = dis.readInt();
short version = dis.readShort();
}
// New way: ByteBuffer
try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
ByteBuffer buf = ByteBuffer.allocate(1024);
channel.read(buf);
buf.flip();
int magic = buf.getInt();
short version = buf.getShort();
}
When I parse network packets stored in files, I almost always use ByteBuffer with order(ByteOrder.LITTLE_ENDIAN). It makes the code self‑documenting and avoids surprises when reading cross‑platform data. For extremely large binary files, I memory‑map only the region that contains the headers and parse on the fly.
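For instance, a small sketch of the little‑endian setup; the file name and the packetLength and flags fields are invented for illustration:
try (FileChannel channel = FileChannel.open(Paths.get("packets.bin"), StandardOpenOption.READ)) {
    ByteBuffer buf = ByteBuffer.allocate(1024);
    buf.order(ByteOrder.LITTLE_ENDIAN); // match the byte order the file was written in
    channel.read(buf);
    buf.flip(); // flipping does not reset the byte order
    int packetLength = buf.getInt(); // decoded as little-endian
    short flags = buf.getShort();
}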
Temporary files are a detail many developers forget. They create files in the current directory, or in /tmp, and never clean them up. Java gives you Files.createTempFile which places the file in the system’s temporary directory and ensures a unique name. Always clean up after yourself.
Path tempFile = Files.createTempFile("pref", ".tmp");
try {
Files.writeString(tempFile, "temporary data");
// process with a FileChannel or reader
} finally {
Files.deleteIfExists(tempFile);
}
You can also pass the DELETE_ON_CLOSE option when opening the file, but I prefer explicit cleanup because it does not depend on when, or whether, a particular channel gets closed, and the intent is visible right in the code. For multiple temporary files, Files.createTempDirectory gives you a directory you can delete recursively, as shown below.
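A sketch of that pattern looks like this; the reverse‑sorted walk makes sure children are deleted before their parent directory:
Path tempDir = Files.createTempDirectory("batch-");
try {
    Files.writeString(tempDir.resolve("part1.tmp"), "intermediate data");
    // ... create and use more files under tempDir ...
} finally {
    try (Stream<Path> walk = Files.walk(tempDir)) {
        walk.sorted(Comparator.reverseOrder())
            .forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    // log and keep going; a leftover temp file is not fatal
                }
            });
    }
}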
Finally, measure. I cannot stress this enough. All the techniques in the world are worthless if you apply them to the wrong bottleneck. I use JMH (Java Microbenchmark Harness) to compare approaches with realistic workloads.
@Benchmark
public int readWithBufferedReader(Blackhole bh) throws IOException {
try (BufferedReader reader = new BufferedReader(new FileReader("test.txt"))) {
String line;
int count = 0;
while ((line = reader.readLine()) != null) { count++; }
bh.consume(count);
return count;
}
}
@Benchmark
public long readWithMappedBuffer(Blackhole bh) throws IOException {
try (FileChannel channel = FileChannel.open(Paths.get("test.txt"), StandardOpenOption.READ)) {
MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
long count = 0;
while (buffer.hasRemaining()) { buffer.get(); count++; }
bh.consume(count);
return count;
}
}
Run these benchmarks on the same hardware you use in production, with files that resemble your real data. I have seen cases where memory‑mapping was actually slower than a simple buffered stream because the file was small and the OS overhead of mapping outweighed the savings. Only numbers can tell you what works.
These ten techniques form the core of my file‑processing toolkit. I start simple – buffered streams, Files.lines – and reach for NIO channels and memory‑mapped files only when the simple approach chokes. The secret is to keep a mental map of trade‑offs: memory vs speed, complexity vs maintainability. With each file you process, ask yourself how big it is, how often you access it, and whether random access matters. Then pick the tool that fits.
I still remember that forty‑minute log parser. Now I can parse the same file in under a minute using a buffered stream with a proper buffer size. The difference is not just in code – it is in the confidence that the tool will finish before I have time to get tea. That confidence is what high‑performance file I/O gives you.