java - Is there any efficient method to find the first byte of all instances of a particular 4 byte block in a file?


I have files that contain archived binary messages. A small file is around 600 MB and contains 9000 messages. Each message begins with a particular 4-byte flag that I know, which indicates the first 4 bytes of the message header (and as such must be captured). The message header is a fixed size for all messages. The header is followed by a payload whose size is identified in the header. Once I've found the start of a particular message header, I know how many bytes there are to the end of the header and can use it to extract the number of bytes in the message. I need to parse the archive file and isolate each message for processing, making sure to include all bytes from the first byte of the 4-byte flag to the end of the specified message length. There is padding between messages, and its length varies.

Due to the size of the file, I don't want to (and in some cases can't) consume the file as a single array. Therefore, I'm looking at things like RandomAccessFile and FileInputStream. It doesn't seem like a simple task to scan a file for a particular sequence of bytes and then take every byte from the first byte of that sequence through a known length. With RandomAccessFile, the read(byte[]) and seek() methods seem to allow me to implement a solution.

To give you an idea, my current implementation involves a method called findflag() that takes a start position in the RandomAccessFile. It seeks to that position and reads 4 bytes starting there. If it finds the flag, it returns startpos. Otherwise, it calls itself recursively, moving to startpos + 1, and repeats until it finds the flag. Since I know the last byte that I read as part of a data message, I start seeking from there:

    // Body of findflag(startpos): test the 4 bytes at startpos, else try the next byte.
    file.seek(startpos);
    byte[] possibleflag = new byte[4];
    file.read(possibleflag, 0, possibleflag.length);
    if (Arrays.equals(byteutils.inttobytes(message.flag), possibleflag)) {
        return startpos;
    } else {
        return findflag(startpos + 1);
    }

Am I overlooking something, either in Java (Java 6 or earlier) or in a well-tested external library (such as an Apache library or similar)? If not, are there better solutions for dealing with binary data in Java, or approaches that are particularly well suited to this problem?

Scan through the file using java.nio.channels.FileChannel, which uses fewer intermediate copies and can map the file into memory. See a benchmark of the alternatives.
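As a rough illustration of the memory-mapped approach, here is a minimal sketch, not a tested production implementation. The class name FlagScanner and the methods mapReadOnly and findFlag are hypothetical names made up for this example. It assumes the whole file fits in a single mapping (at most Integer.MAX_VALUE bytes, which covers the 600 MB case); larger files would need several mappings that overlap by 3 bytes so a flag on a boundary is not missed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper for illustration; none of these names come from the question. */
final class FlagScanner {

    private FlagScanner() {}

    /** Maps the whole file read-only. Assumption: the file is at most Integer.MAX_VALUE bytes. */
    static ByteBuffer mapReadOnly(String path) throws IOException {
        RandomAccessFile file = new RandomAccessFile(path, "r");
        try {
            FileChannel channel = file.getChannel();
            long size = channel.size();
            if (size > Integer.MAX_VALUE) {
                throw new IOException("File too large for a single mapping: " + size);
            }
            // A mapping remains valid after the channel that created it is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
        } finally {
            file.close();
        }
    }

    /**
     * Returns the index of the first occurrence of flag at or after startPos,
     * or -1 if the flag does not occur in the rest of the buffer.
     * Uses absolute reads, so the buffer's position is left untouched.
     */
    static int findFlag(ByteBuffer data, byte[] flag, int startPos) {
        int lastCandidate = data.limit() - flag.length;
        for (int i = Math.max(0, startPos); i <= lastCandidate; i++) {
            int j = 0;
            while (j < flag.length && data.get(i + j) == flag[j]) {
                j++;
            }
            if (j == flag.length) {
                return i;
            }
        }
        return -1;
    }
}
```

The loop replaces the recursive call per byte, so long runs of padding do not deepen the call stack. After each hit you would read the header at the returned index, work out the message length from it, copy that many bytes out of the buffer, and call findFlag again starting at the end of that message, which also avoids matching flag-like bytes inside a payload.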

