Discussion Thread
Data Extractor
Extract any data, including email addresses and URLs from your files and webpages.
Posted in the Data Extractor Forum.
Batch processing
When extracting large amounts of data, the program can eventually slow to a crawl. Would it be possible in a future version to have an option that automatically saves the extracted data after x results, then clears the memory and carries on? Two options for saving would be useful: save as sequential files, e.g.
extract 1.txt
extract 2.txt
etc.
or append the results to a single file.
Alternatively, is there a way of doing this with JavaScript?
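For anyone considering a standalone script, the batching behaviour requested above can be sketched roughly as follows. This is only an illustration in Python, not a Data Extractor feature or API: the class name `BatchSaver`, the `batch_size` parameter and the single-file name `extract.txt` are all assumptions made for the example.

```python
from pathlib import Path


class BatchSaver:
    """Hypothetical helper: buffer results and write them out every `batch_size` items."""

    def __init__(self, out_dir, batch_size=1000, append=False):
        self.out_dir = Path(out_dir)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self.batch_size = batch_size
        self.append = append      # True: one growing file; False: sequential files
        self.buffer = []
        self.file_index = 0

    def add(self, result):
        """Store one result and flush once the buffer is full."""
        self.buffer.append(result)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write buffered results to disk, then clear them from memory."""
        if not self.buffer:
            return
        if self.append:
            path = self.out_dir / "extract.txt"   # assumed file name
            mode = "a"
        else:
            self.file_index += 1
            path = self.out_dir / f"extract {self.file_index}.txt"
            mode = "w"
        with open(path, mode, encoding="utf-8") as fh:
            fh.writelines(line + "\n" for line in self.buffer)
        self.buffer.clear()
```

Call `add()` for each extracted result and `flush()` once at the end so a final partial batch is not lost. Passing `append=True` gives the single-file behaviour instead of "extract 1.txt", "extract 2.txt", and so on.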
Batch processing
Well, that's certainly a good idea for the future. For now, I'd say your best bet is to break your extractions into separate parts.
Batch processing
Thanks. I'm currently breaking it into chunks, but it wastes a lot of processing time: if a chunk finishes at, for example, midnight, I won't be around until 9am to load the next chunk. It would be nice to have something that can be left to run unattended.
Batch processing
Was there ever any progress with this? Another useful idea would be an option to detect and remove duplicate URLs etc. on the fly rather than after extraction, although I'm not sure whether this would speed things up or actually slow them down. In the absence of this, maybe a "blacklist" of URLs that you are not interested in extracting (including partial URLs), so a blacklist entry of e.g. "http://www.bbc.co.uk/*" (note the * as a wildcard) would ignore anything from the BBC website. Of course JavaScript rules can help, but it may be useful for novice users.
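To make the two ideas above concrete, here is a rough Python sketch of on-the-fly duplicate removal combined with a wildcard blacklist. It is a standalone illustration, not a Data Extractor rule: the names `BLACKLIST` and `filter_urls` are made up for the example, and the wildcard matching comes from Python's standard `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Assumed example blacklist; "*" matches any run of characters.
BLACKLIST = ["http://www.bbc.co.uk/*"]


def filter_urls(urls, blacklist=BLACKLIST):
    """Yield each URL once, skipping any that match a blacklist pattern."""
    seen = set()
    for url in urls:
        if url in seen:
            continue              # duplicate: already handled
        seen.add(url)
        if any(fnmatchcase(url, pattern) for pattern in blacklist):
            continue              # blacklisted: ignore
        yield url
```

On the speed question: a set lookup takes roughly constant time however many URLs have been seen, so checking for duplicates as you go should cost very little per URL. Note that `fnmatch` also treats "?" and "[" as special characters, which matters if a blacklist pattern contains a query string.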
Batch processing
You could do this with a custom JavaScript rule, but as of now we have not changed the app to add this.
Batch processing
Pity it hasn't been updated. AFAICT batch processing can't be done with JavaScript, as disk access is blocked for security. Unfortunately, because of this, Data Extractor is effectively useless to me, as it slows to a crawl after a few hundred results (wish I'd realised this before I bought it :-((( ). Batch processing and the ability to switch off the preview window, or at least disable image downloading, would make this a killer app, but it looks like I'll have to investigate writing my own solution!
Batch processing
Martin, if you disable image downloads in the Internet Explorer options window, it will also be disabled in Data Extractor. I hope this helps.
Batch processing
I have disabled images in IE but Data Extractor is still displaying images!
Batch processing
That's very strange. I'd try a restart after changing the settings.
Batch processing
A restart didn't help; however, I've now found a workaround and my extracting is now lightning fast :-) I download all the pages of interest to my local hard drive first (which strips images etc. automatically) and then extract from there. Data Extractor is useful once again!
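For anyone wanting to script a similar download-first step, a minimal Python sketch might look like the following. This is not the tool used in the post above, which isn't named. The function names `fetch_html` and `save_pages` and the hash-based file naming are assumptions for illustration. Only the HTML document itself is requested, so images referenced by the page are never downloaded.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def fetch_html(url, timeout=30):
    """Download the raw bytes of one page (the HTML only, no images)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def save_pages(urls, out_dir, fetch=fetch_html):
    """Save each page under out_dir and return the local paths, one per input URL."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        # Hash the URL to get a safe, unique file name (an assumed naming scheme).
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
        path = out_dir / name
        if not path.exists():     # skip pages already downloaded
            path.write_bytes(fetch(url))
        saved.append(path)
    return saved
```

The resulting folder of .html files can then be used as the local source to extract from. The `fetch` parameter is there so the download step can be swapped out, for example when testing without a network connection.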