Win port of sort (unix)
Python, Tcl or another language with built-in sort
msdos sort

> -----Original Message-----
> From: pic microcontroller discussion list
> [mailto:PICLIST@MITVMA.MIT.EDU] On Behalf Of Rodrigo Valladares P.
> Sent: Thursday, November 28, 2002 3:16 AM
> To: PICLIST@MITVMA.MIT.EDU
> Subject: Re: [OT]:Sorting large files
>
>
> Hi,
>
> for a general discussion about "external" sorting, read vol. 3 of The Art
> of Computer Programming, Sorting and Searching (Donald Knuth). I had to
> do something like this years ago; I don't remember exactly what was done,
> but I think it was a variant of mergesort:
>
> - split the file into chunks that can be sorted in memory using a fast
> algorithm, quicksort for example,
> - use a heap to read the first elements of n opened files (not necessarily
> all the sorted chunks) and write the smallest, then read and insert into
> the heap the next element of the chunk from which that element was moved
> to output. Continue reading the data until all the chunks are combined
> into a large file that is sorted.
> - take other chunks or combined files and apply the same algorithm,
> until there is one big file, that is sorted.
>
> if you have an original file of n lines, and you do a first split into
> chunks of m lines, then you have n/m chunks, and the time for the first
> phase sort is O(n/m * m*log(m)) (quicksort n/m times).
> then you have to recombine chunks, and for a chunk of size k, I think
> that is O(k + log(k)) ???, the exact order is left as an exercise :-)
>
> there are far more efficient algorithms that I don't remember and
> don't use now. The above was done on SCO Unix, and I know there is some
> specialized software for sorting large files under unix (sorry, my head...
> don't remember the name), and for windozes, mmmm, I will try to move the
> problem to another more stable OS.
>
>
> RVP.
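The chunk-then-merge procedure described above can be sketched roughly as follows. This is only an illustration in Python, not the code that was used on SCO Unix: the function names, the `lines_per_chunk` and `fan_in` parameters and their defaults are all made up for the example, and it assumes lines end in "\n" and that plain byte order is the ordering wanted (which is case-sensitive for ASCII).

```python
import heapq
import itertools
import os
import tempfile


def write_sorted_chunks(src_path, lines_per_chunk, tmp_dir):
    """Phase 1: cut the input into chunk files, each sorted in memory."""
    chunk_paths = []
    with open(src_path, "rb") as src:
        while True:
            chunk = list(itertools.islice(src, lines_per_chunk))
            if not chunk:
                break
            if not chunk[-1].endswith(b"\n"):
                chunk[-1] += b"\n"  # last line of the file may lack a newline
            chunk.sort()  # byte order, so upper case sorts before lower case
            fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".chunk")
            with os.fdopen(fd, "wb") as out:
                out.writelines(chunk)
            chunk_paths.append(path)
    return chunk_paths


def merge_chunks(chunk_paths, dst_path):
    """Phase 2: merge sorted files through a heap holding one line per file."""
    files = [open(p, "rb") for p in chunk_paths]
    try:
        heap = []
        for idx, f in enumerate(files):
            line = f.readline()
            if line:
                heap.append((line, idx))
        heapq.heapify(heap)
        with open(dst_path, "wb") as dst:
            while heap:
                line, idx = heapq.heappop(heap)  # smallest pending line
                dst.write(line)
                nxt = files[idx].readline()  # refill from the same chunk
                if nxt:
                    heapq.heappush(heap, (nxt, idx))
    finally:
        for f in files:
            f.close()


def external_sort(src_path, dst_path, lines_per_chunk=1_000_000, fan_in=64):
    """Sort src_path into dst_path; parameter defaults are arbitrary guesses."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = write_sorted_chunks(src_path, lines_per_chunk, tmp_dir)
        # Merge at most fan_in files per pass until few enough remain.
        while len(paths) > fan_in:
            merged = []
            for i in range(0, len(paths), fan_in):
                fd, out = tempfile.mkstemp(dir=tmp_dir, suffix=".merged")
                os.close(fd)
                merge_chunks(paths[i:i + fan_in], out)
                merged.append(out)
            for p in paths:
                os.remove(p)
            paths = merged
        merge_chunks(paths, dst_path)
```

Something like `external_sort("big.txt", "big.sorted.txt")` would run it, with `lines_per_chunk` picked so that one chunk fits comfortably in memory. With a heap, each merge pass over k files costs on the order of n*log(k) comparisons for n lines in total. Python's standard library also has `heapq.merge`, which could stand in for the hand-written heap loop.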
>
> Brendan Moran wrote:
>
> > Does anyone know of an algorithm for sorting extremely large files of
> > linebreak-separated, case-sensitive ASCII values?
> >
> > I have a file that is 3,480,231kB of unsorted ASCII values, and I need it
> > sorted... and, yes, I do have some idea of the time that this will take.
> >
> > I'm using WinXP for this, so win32 compatible utilities are a good thing
> > for me.
> >
> > Thanks
> >
> > --Brendan
> >
> > --
> > http://www.piclist.com hint: To leave the PICList
> > mailto:piclist-unsubscribe-request@mitvma.mit.edu
>

--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads