I'd do it this way:
- Parse the original file and write every entry into a new file, using fixed-length data blocks. Say your longest string is 10 bytes long; take 10 + x as the block length, where x is room for the extra info you want to store alongside each entry. Entry number i then starts at byte position i*(10+x). You also need the number of entries to size the file (file size = noOfEntries*blockLength; use a RandomAccessFile and setLength to set that file length). A sketch of this step follows after the list.
- Now use the quicksort algorithm to sort the entries in the file (my idea is to end up with a sorted file, which makes the final step far easier and faster). Hashing would theoretically work too, but then you'd have to rearrange duplicate entries so that all duplicates end up grouped together, so it's not really an option here. The second sketch below shows sorting the records in place.
- Parse the file with the now sorted entries. Remember the position of the first occurrence of an entry and increment a duplicate counter until a new entry starts. Then take that first entry, add the additional info you want to have there, and write it to a new "final result" file. Continue this way with all remaining entries in the sorted file. The third sketch below shows this pass.
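
Here is a minimal sketch of the first step, assuming the entries are lines of a text file, the longest entry is 10 bytes, and 4 extra bytes are reserved per record; the file names and constants are placeholders, not anything prescribed by the problem:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;

    public class FixedLengthWriter {
        static final int ENTRY_LEN = 10;             // assumed longest entry
        static final int EXTRA = 4;                  // room for the extra info (e.g. a count)
        static final int RECORD_LEN = ENTRY_LEN + EXTRA;

        public static void main(String[] args) throws IOException {
            long count = 0;
            try (BufferedReader in = new BufferedReader(new FileReader("entries.txt"));
                 RandomAccessFile out = new RandomAccessFile("records.dat", "rw")) {
                String line;
                while ((line = in.readLine()) != null) {
                    byte[] record = new byte[RECORD_LEN];   // zero-padded fixed-length block
                    byte[] data = line.getBytes(StandardCharsets.UTF_8);
                    System.arraycopy(data, 0, record, 0, Math.min(data.length, ENTRY_LEN));
                    out.seek(count * RECORD_LEN);           // record i starts at i * RECORD_LEN
                    out.write(record);
                    count++;
                }
                out.setLength(count * RECORD_LEN);          // final size = noOfEntries * blockLength
            }
        }
    }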
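For the sorting step, one possible approach (an assumption, not the only way to do it) is an in-place quicksort that seeks, reads and swaps whole records directly in the record file, comparing the raw record bytes:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.Arrays;

    public class FileQuickSort {
        static final int RECORD_LEN = 14;   // must match the writer's block length

        static byte[] read(RandomAccessFile f, long i) throws IOException {
            byte[] buf = new byte[RECORD_LEN];
            f.seek(i * RECORD_LEN);
            f.readFully(buf);
            return buf;
        }

        static void write(RandomAccessFile f, long i, byte[] rec) throws IOException {
            f.seek(i * RECORD_LEN);
            f.write(rec);
        }

        static void quicksort(RandomAccessFile f, long lo, long hi) throws IOException {
            if (lo >= hi) return;
            byte[] pivot = read(f, hi);                     // last record as pivot
            long store = lo;
            for (long i = lo; i < hi; i++) {
                byte[] cur = read(f, i);
                if (Arrays.compareUnsigned(cur, pivot) < 0) {
                    byte[] tmp = read(f, store);            // swap records i and store
                    write(f, store, cur);
                    write(f, i, tmp);
                    store++;
                }
            }
            byte[] tmp = read(f, store);                    // move pivot into its final place
            write(f, store, pivot);
            write(f, hi, tmp);
            quicksort(f, lo, store - 1);
            quicksort(f, store + 1, hi);
        }

        public static void main(String[] args) throws IOException {
            try (RandomAccessFile f = new RandomAccessFile("records.dat", "rw")) {
                long n = f.length() / RECORD_LEN;
                quicksort(f, 0, n - 1);
            }
        }
    }

Doing one seek per record access is slow; buffering several records per read, or sorting chunks in memory and merging them, would be an obvious improvement.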
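And a sketch of the final pass, assuming the "extra info" you want is simply the duplicate count and the result is written as "entry&lt;TAB&gt;count" lines (the output format is my assumption):

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class DuplicateCounter {
        static final int ENTRY_LEN = 10;
        static final int RECORD_LEN = 14;

        public static void main(String[] args) throws IOException {
            try (RandomAccessFile in = new RandomAccessFile("records.dat", "r");
                 PrintWriter out = new PrintWriter("result.txt", "UTF-8")) {
                long n = in.length() / RECORD_LEN;
                byte[] current = null;
                long count = 0;
                for (long i = 0; i < n; i++) {
                    byte[] rec = new byte[RECORD_LEN];
                    in.seek(i * RECORD_LEN);
                    in.readFully(rec);
                    byte[] entry = Arrays.copyOf(rec, ENTRY_LEN);
                    if (current != null && Arrays.equals(entry, current)) {
                        count++;                            // same entry as before, keep counting
                    } else {
                        if (current != null) emit(out, current, count);
                        current = entry;                    // a new entry starts here
                        count = 1;
                    }
                }
                if (current != null) emit(out, current, count);
            }
        }

        static void emit(PrintWriter out, byte[] entry, long count) {
            String text = new String(entry, StandardCharsets.UTF_8).trim();  // strip zero padding
            out.println(text + "\t" + count);
        }
    }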
Conclusions: I think this should be reasonably fast and use a reasonable amount of resources. However, it depends on the data you have. If you have a very large number of duplicates, quicksort performance will degrade. Also, if your longest data entry is much longer than the average, fixed-length blocks will waste file space.