Extract any data, including email addresses and URLs from your files and webpages.
Posted in the Data Extractor Forum.
When doing large amounts of data extraction the program can eventually slow to a crawl. Would it be possible in a future version to have an option that automatically saves extracted data after x results, clears the memory, and carries on? Two options for saving would be useful: save as sequential files, or append the results to a single file.
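For what it's worth, the requested behaviour can be sketched roughly like this (a minimal illustration, not Data Extractor's actual code; the batch size, function names, and filenames are all made up):

```python
# Flush results to disk every BATCH_SIZE items, either as numbered
# sequential files or appended to a single file, then free the memory.
BATCH_SIZE = 3  # small value for demonstration only

def flush(batch, batch_no, mode="sequential", path="results"):
    """Write one batch out and return an empty list so memory is freed."""
    if mode == "sequential":
        # results1.txt, results2.txt, ...
        with open(f"{path}{batch_no}.txt", "w") as f:
            f.write("\n".join(batch) + "\n")
    else:
        # append everything to one growing file
        with open(f"{path}.txt", "a") as f:
            f.write("\n".join(batch) + "\n")
    return []

def extract_all(items):
    batch, batch_no = [], 0
    for item in items:
        batch.append(item)
        if len(batch) >= BATCH_SIZE:
            batch_no += 1
            batch = flush(batch, batch_no)  # save and clear, then carry on
    if batch:
        flush(batch, batch_no + 1)  # write any remainder
```

The point is that memory use stays bounded by the batch size, so an unattended run never slows to a crawl.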
Well, that's certainly a good idea for the future. I'd say for now your best bet would be to break your extractions into separate parts yourself.
Thanks, I'm currently breaking it into chunks, but that wastes a lot of processing time: if a chunk finishes at, for example, midnight, I won't be around until 9am to load the next one. It would be nice to have something that can be left to run unattended.
Was there ever any progress with this? Another idea that would be useful is an option to detect and remove duplicate URLs on the fly, rather than after extraction, though I'm not sure whether this would speed things up or actually slow them down. Failing that, maybe a "blacklist" of URLs that you are not interested in extracting (including partial URLs), so a blacklist entry of e.g. "http://www.bbc.co.uk/" would exclude every URL starting with that prefix.
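Both ideas can be combined in one pass, assuming the extractor yields URLs one at a time (the names here are hypothetical, not part of Data Extractor):

```python
# On-the-fly duplicate removal plus a prefix "blacklist".
BLACKLIST = ["http://www.bbc.co.uk/"]  # partial URLs, matched as prefixes

def filter_urls(urls):
    seen = set()  # set membership checks are O(1), so deduplicating
    for url in urls:  # as you go should not noticeably slow things down
        if url in seen:
            continue  # duplicate: skip it
        if any(url.startswith(prefix) for prefix in BLACKLIST):
            continue  # blacklisted prefix: skip it
        seen.add(url)
        yield url
```

Since a set lookup is constant-time, doing the deduplication during extraction should be faster overall than a second pass afterwards, at the cost of holding the seen-URL set in memory.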
Martin, if you disable image downloads in the Internet Explorer options window, it will also disable them in Data Extractor. I hope this helps.
I have disabled images in IE, but Data Extractor is still displaying images!
That's very strange. I'd try a restart after changing the settings.
A restart didn't help. However, I've now found a workaround, and my extracting is now lightning fast :-) I download all the pages of interest to my local hard drive first (which strips images etc. automatically) and then extract from there. Data Extractor is useful once again!
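For anyone else wanting to script the same workaround, it amounts to two small steps; this is only an illustrative sketch (the e-mail pattern and filenames are made up, not Data Extractor's own):

```python
# Step 1: save each page to the local drive (only the HTML is fetched,
# so images never enter the picture). Step 2: extract from the local copy.
import re
import urllib.request

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # rough e-mail pattern

def download(url, filename):
    """Save one page's HTML to disk."""
    with urllib.request.urlopen(url) as resp, open(filename, "wb") as f:
        f.write(resp.read())

def extract_emails(filename):
    """Extract e-mail addresses from a locally saved page."""
    with open(filename, encoding="utf-8", errors="ignore") as f:
        return EMAIL_RE.findall(f.read())
```

Because extraction then runs against local files, there is no per-page network or rendering overhead, which matches the speed-up described above.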