Posted in the Data Extractor Forum.




Stack Overflow Problem

Hello,

I recently started using Data Extractor to extract certain text from web pages. However, whenever I use JavaScript to extract the text, I get the error "Stack Overflow at line: 0".

Any help would be appreciated.


Regards
by Bhavya Anand on Mar 20 2009 6:13am Reply

Stack Overflow Problem

I'd try a reinstall first. Then let me know.
by Nico Westerdale on Mar 20 2009 9:23am Reply

Stack Overflow Problem

Hello Nico,

I followed your advice and reinstalled the application, but the problem persists. Could you shed some light on this?

Regards
by Bhavya Anand on Mar 26 2009 11:40am Reply

Stack Overflow Problem

That's very strange. Could you tell me exactly what your settings are on all tabs?
by Nico Westerdale on Mar 26 2009 11:59am Reply

Stack Overflow Problem

Well, I use Data Extractor on a cached web page and apply a JavaScript rule to extract the data. Do you want to see the JavaScript?
by Bhavya Anand on Apr 6 2009 7:45am Reply

Stack Overflow Problem

Without having all the settings there's not much I can do!
by Nico Westerdale on Apr 6 2009 10:04am Reply

Stack Overflow Problem

OK, these are the settings on all my tabs:

1. Where to Extract - Extract from Multiple Files and/or Web page URLs
Here I use a cached web page from a third-party web crawler.

2. What to Extract - I made a custom rule called "Extract Info", which combines some JavaScript code I found on the internet with code I wrote myself.

It is as follows, if you want a look:

var tables = document.all.tags('TABLE');
var rows;
var cells;
// Time as HH:MM and date as DD-Mon-YYYY with German month abbreviations.
var timeExp = /([0-1][0-9]|2[0-3]):([0-5][0-9])/;
var dateExp = /[0-3][0-9]-(Jan|Feb|Mrz|Apr|Mai|Jun|Jul|Aug|Sep|Okt|Nov|Dez)-[0-9][0-9][0-9][0-9]/g;
var counter = 0;
var time, date;
if (tables)
{
    for (var t = 0; t < tables.length; t++)
    {
        rows = tables[t].all.tags('TR');
        // Only process innermost tables (those containing no nested tables).
        if (tables[t].all.tags('TABLE').length == 0)
        {
            for (var r = 0; r < rows.length; r++)
            {
                if (rows[r].innerText != '')
                {
                    cells = rows[r].all.tags('TD');
                    DataExtractor.StartNewResult();

                    var text = rows[r].innerText;
                    DataExtractor.SetColumns(3);

                    if (text.match("Seramis"))
                    {
                        DataExtractor.AddResult(1, text);
                        // The time and date are read from the preceding row.
                        time = rows[r-1].innerText.match(timeExp);
                        date = rows[r-1].innerText.match(dateExp);
                        DataExtractor.AddResult(2, time);
                        DataExtractor.AddResult(3, date);
                    }
                }
            }
        }
    }
}

Then, when I click start extraction, I get the error "Stack Overflow at line: 0".

I also noticed something in the results. When I extract data from a large div tag that itself contains more div tags, the nested div tags are extracted along with the text. When I export the results to an Excel file, I see little square boxes in the places where the div tags would have been. Any thoughts on this as well?
by Bhavya Anand on Apr 7 2009 6:25am Reply
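
Two details of the rule above may be worth checking, whether or not they relate to the reported error: `rows[r-1]` is undefined when `r` is 0, and `String.prototype.match` returns an array (or `null`) rather than a string. The following is only a sketch of one way to guard against both. The helpers `firstMatch` and `previousRowText` and the sample rows are hypothetical and are not part of Data Extractor.

```javascript
// Hypothetical helper (not part of the Data Extractor API): returns the first
// match of `pattern` in `text` as a plain string, or '' when nothing matches.
function firstMatch(text, pattern) {
  if (typeof text !== 'string') {
    return '';
  }
  var found = text.match(pattern); // null when there is no match
  return found ? found[0] : '';
}

// Hypothetical helper: text of the row before index r, or '' for the first row.
function previousRowText(rows, r) {
  return r > 0 && rows[r - 1] ? String(rows[r - 1].innerText || '') : '';
}

var timeExp = /([0-1][0-9]|2[0-3]):([0-5][0-9])/;
var dateExp = /[0-3][0-9]-(Jan|Feb|Mrz|Apr|Mai|Jun|Jul|Aug|Sep|Okt|Nov|Dez)-[0-9][0-9][0-9][0-9]/;

// Stand-in rows (made-up data) so the sketch runs outside a web page.
var sampleRows = [
  { innerText: '17-Apr-2009 09:29' },
  { innerText: 'Seramis' }
];

var time = firstMatch(previousRowText(sampleRows, 1), timeExp);
var date = firstMatch(previousRowText(sampleRows, 1), dateExp);
```

In the rule itself, the two values could then be passed to `DataExtractor.AddResult` as strings instead of arrays.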

Stack Overflow Problem

Again, without the URL there isn't much I can do. What happens when you try using a simpler rule on the page?
by Nico Westerdale on Apr 7 2009 11:58am Reply
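
As an illustration of what a simpler rule might look like, here is a sketch that records only the row count of each table. It assumes the `DataExtractor` methods behave as they appear to in the rule posted above, and it uses the standard `getElementsByTagName` instead of `document.all.tags`. The function name `runSimpleRule` and the stand-in objects are hypothetical and exist only so the sketch can run outside the application.

```javascript
// Minimal rule: one result per table, recording its index and row count.
// `doc` and `extractor` are parameters so the sketch can run with stand-ins;
// inside a rule they would be the page's `document` and the `DataExtractor`
// object used in the rule above (an assumption about the host environment).
function runSimpleRule(doc, extractor) {
  var tables = doc.getElementsByTagName('TABLE');
  for (var t = 0; t < tables.length; t++) {
    extractor.StartNewResult();
    extractor.SetColumns(2);
    extractor.AddResult(1, String(t));
    extractor.AddResult(2, String(tables[t].getElementsByTagName('TR').length));
  }
  return tables.length;
}

// Stand-ins (hypothetical) that mimic just enough of the page and the API.
function fakeTable(rowCount) {
  return { getElementsByTagName: function () { return new Array(rowCount); } };
}
var fakeDoc = {
  getElementsByTagName: function () { return [fakeTable(2), fakeTable(5)]; }
};
var recorded = [];
var fakeExtractor = {
  StartNewResult: function () { recorded.push([]); },
  SetColumns: function (count) { /* column count is ignored by the stand-in */ },
  AddResult: function (column, value) {
    recorded[recorded.length - 1][column - 1] = value;
  }
};

var tableCount = runSimpleRule(fakeDoc, fakeExtractor);
```

If a rule this small runs cleanly on the cached page, the original rule can be reintroduced piece by piece to find the step that triggers the error.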

Stack Overflow Problem

The URL is: http://www.tropenland.at/...p?TID=8844.

I did notice one thing: the JavaScript works on the live URL without the error. The error only happens when I use the cached web page from a third-party crawler.
However, the special characters that I wrote about in my previous post appear in both cases. Any help would be appreciated.
by Bhavya Anand on Apr 17 2009 9:29am Reply

Stack Overflow Problem

Well, your answer clearly lies in the difference between the live and crawler pages.
by Nico Westerdale on Apr 17 2009 10:39am Reply

Stack Overflow Problem

I agree on that issue. However, the special characters appear in the output whether I use live or crawler pages. If you were to try it yourself, you would see what I am talking about.

These characters make the output look messy. I have tried several URLs, but the problem persists.
by Bhavya Anand on Apr 21 2009 5:24am Reply

Stack Overflow Problem

Then I suggest you use a different crawler as that's what's causing the problem!
by Nico Westerdale on Apr 21 2009 9:43am Reply

Stack Overflow Problem

I have now tried another crawler and also tested live web pages, but the problem with the special characters in the output persists. I suggest you give it a try yourself to understand my problem.

I have also tried variations of my JavaScript to strip the characters from the output, but to no avail.
by Bhavya Anand on May 4 2009 7:23am Reply
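
The thread does not establish what the square boxes are. One possibility, stated here purely as an assumption, is that they are control characters (carriage returns, line feeds, tabs) or non-breaking spaces that `innerText` emits at block boundaries such as nested div tags. Under that assumption, a sketch of a cleanup step could look like this; `cleanText` is a hypothetical helper name.

```javascript
// Hypothetical helper: replaces runs of control characters and non-breaking
// spaces with a single space, collapses repeated spaces, and trims the ends.
function cleanText(text) {
  return String(text)
    .replace(/[\u0000-\u001F\u007F\u00A0]+/g, ' ') // control chars and NBSP
    .replace(/ {2,}/g, ' ')                        // collapse repeated spaces
    .replace(/^ +| +$/g, '');                      // trim leading/trailing spaces
}

// Made-up sample resembling text taken from nested block elements.
var cleaned = cleanText('Seramis\r\n\tGranulat\u00A0 7,5 l ');
```

The cleaned string would be passed along in place of the raw text, for example `DataExtractor.AddResult(1, cleanText(text))`. If the boxes remain, the offending characters fall outside the ranges listed in the first pattern and would need to be identified first.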
