Discussion Thread
Data Extractor

Extract any data, including email addresses and URLs from your files and webpages.
Posted in the Data Extractor Forum.
Data Extraction Rules
Two questions:
1. Is there a way to combine several Extraction Rules (e.g. Email Address, Phone, Address) into a single Rule?
2. Are there any additional rules available for extracting complete contact information, i.e. first name, last name, company, address, city, state, zip, telephone, email, URL, etc.?
Thanks,
David
Data Extraction Rules
David,
The individual rules for email and phone extraction are implemented as regular expressions, which are quick and easy to write. In version 3.0 of the Data Extractor we added JavaScript-based rules precisely so that you can accomplish tasks like extracting address information.
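To give a rough idea of the regular-expression approach, here is a minimal sketch in plain JavaScript. It is not the Data Extractor's own rule format: the two patterns are deliberately simplified, and the `extractContacts` helper name is made up for this example. It shows how several patterns can be run over the same text and returned as one combined result.

```javascript
// Simplified patterns for illustration only; real email and phone formats vary more than this.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g;

// Hypothetical helper: applies both patterns to the same text
// and returns the matches grouped by type.
function extractContacts(text) {
  return {
    emails: text.match(EMAIL_PATTERN) || [],
    phones: text.match(PHONE_PATTERN) || [],
  };
}

const sample = "Contact David at david@example.com or (555) 123-4567.";
console.log(extractContacts(sample));
```

The phone pattern here only covers ten-digit North American style numbers, so it would need adjusting for other formats.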
The JavaScript rules allow you to define which columns appear on the extraction grid. Creating these rules does require some knowledge of writing JavaScript for web pages and familiarity with the document object model (DOM). The approach that you should take is to loop over the table rows and extract the relevant information. Because every web page is structured differently, it's hard to write a general rule to accomplish this task.
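The row loop described above might look something like the following sketch. It uses only the standard DOM properties `rows`, `cells`, and `textContent`; the `extractRows` function, its parameters, and the column headings in the usage note are assumptions for illustration, and the way a rule hands its results back to the extraction grid is not shown because that part is specific to the Data Extractor.

```javascript
// Hypothetical helper: turns each table row into one record.
// `table` is a DOM table element (or any object with the same shape);
// `columns` lists the column headings in the same order as the table cells.
function extractRows(table, columns) {
  const records = [];
  for (const row of Array.from(table.rows)) {
    const cells = Array.from(row.cells);
    // Skip rows whose cell count does not match the expected columns
    // (for example, spacer rows or rows with merged cells).
    if (cells.length !== columns.length) continue;

    const record = {};
    columns.forEach((heading, i) => {
      record[heading] = cells[i].textContent.trim();
    });
    records.push(record);
  }
  return records;
}
```

On a real page this would be called with something like `extractRows(document.querySelector("table"), ["Name", "Company", "Email"])`, with the selector and headings adjusted to match the page's actual structure.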
There are more details in the help page on this website, and we also offer a rule creation service if you would like us to create the rules for you.
Data Extraction Rules
Can you provide a cost estimate for extracting data from a particular site for me?
Data Extraction Rules
I'm glad that you're interested. We determine cost based on the difficulty of the rule. To work this out, we need the following pieces of information from you:
1: The URL of the page you want to extract from. If there are multiple pages with different structures, please send all of the URLs.
2: The exact column headings that you wish to appear on the extraction grid.
3: One row of sample data for the extraction grid, so that there is no possibility of confusion.
Rules usually cost a hundred dollars and up to create. I think it would be best if you reply to this off-list by clicking the email link at the header of this post.