Posted in the Data Extractor Forum.
extract text between two keywords.
Hi, my XML file contains the following:
<bookmark filepath="66286_toc.pdf" page="1" title="Table of Contents" type="GoToR" view="XYZ" view-left="-19" view-top="847" />
I want to extract only 66286_toc.pdf
I am not sure how to write a regex for this. Please help.
extract text between two keywords.
This depends on the exact structure of your data, but you would probably want to match digits, followed by an underscore, followed by letters, then a dot, then "pdf".
You may find this regular expression help file useful in formulating regular expressions:
http://geekswithblogs.net...s/235.aspx
extract text between two keywords.
Hi
Thanks for your quick reply. I read that but still cannot make it work. Maybe it's just because I don't understand it. Basically, I want to extract all PDF filenames between the bookmark tags in my XML files. Below is what I tried.
The first keyword is bookmark filepath=".
The second keyword is " page.
I tried ^{bookmark filepath=}*{" page}$
but it didn't work. Any thoughts?
extract text between two keywords.
Try this pattern:
filepath="\d*_\w*\.pdf"
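The pattern above matches the whole filepath="..." attribute. To return only the file name, one option is a capture group. The following is a minimal sketch in Python, not Data Extractor syntax (the thread does not say which regex flavor the product uses). The names FILEPATH_RE and extract_pdf_names are made up for illustration, and the pattern uses + instead of * so that at least one digit and one word character are required.

```python
import re

# Capture only the file name inside filepath="...".
# \d+ = digits, _ = underscore, \w+ = word characters, \.pdf = literal ".pdf".
FILEPATH_RE = re.compile(r'filepath="(\d+_\w+\.pdf)"')

def extract_pdf_names(text):
    """Return every PDF file name found in a filepath="..." attribute."""
    # With a single capture group, findall returns the group contents.
    return FILEPATH_RE.findall(text)

sample = '<bookmark filepath="66286_toc.pdf" page="1" title="Table of Contents" type="GoToR" />'
print(extract_pdf_names(sample))  # ['66286_toc.pdf']
```

If your file names do not always start with digits, the character classes would need loosening, for example to [^"]+\.pdf.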
extract text between two keywords.
Hi,
Did this work? I don't want to hijack the thread, but this is so relevant: I want to do something very similar, extracting everything between the words option value and option.
This is to extract shopping cart info from a biggish web site, so it will need to search through all the sub-folders of the main site folder on my hard drive.
If anyone could help me I would be very grateful. I'm a total noob at this coding stuff.
Thanks,
Mark
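For a request like this one, a general "between two keywords" search over a folder tree could be sketched as below. This is Python rather than Data Extractor, and it rests on assumptions: the function names between and scan_folder are hypothetical, the pages are assumed to be .htm/.html files readable as UTF-8, and the keywords are passed in as plain text exactly as they appear in the files.

```python
import os
import re

def between(text, start, end):
    """Return every shortest stretch of text found between start and end."""
    # re.escape treats the keywords literally; (.*?) is a lazy capture.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    # DOTALL lets a match span line breaks.
    return re.findall(pattern, text, flags=re.DOTALL)

def scan_folder(root, start, end, extensions=('.htm', '.html')):
    """Walk root and all of its sub-folders, yielding (path, match) pairs."""
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding='utf-8', errors='replace') as handle:
                for match in between(handle.read(), start, end):
                    yield path, match
```

A call such as scan_folder('site', 'option value', 'option') would then yield each file path together with the text found between the two words; 'site' here stands in for the main site folder.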
extract text between two keywords.
It worked in my test case, Mark. I can't really say more than that!
extract text between two keywords.
Is it possible to create a pattern for web pages where the text looks like this?
Broker Name : A J Banford
Company Name : The Radford Company
Address:
12308 Ocean Gateway #5
ocean City, Maryland 21842
800-471-1108
530-546-7963 (fax)
Brokers By Name
I have thousands of web pages that all have the same formatted text in the middle. I need the program to copy everything from "Broker Name:" through "(fax)", including those two markers, into a text file, then skip a line or two and add the text from the next web page.
The result will be a text file with all the names and addresses in it. I have another program that will parse the names and addresses and process them.
qpcth
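One way to approach this is a lazy match that starts at "Broker Name" and stops at the first "(fax)" after it. The sketch below is Python, not Data Extractor, and assumes each page has already been reduced to plain text. The names BLOCK_RE, extract_block and collect_blocks are hypothetical, and the pattern allows optional spaces before the colon because the sample shows "Broker Name :".

```python
import re

# From "Broker Name" through the first "(fax)" that follows, inclusive.
# DOTALL lets the match run across line breaks.
BLOCK_RE = re.compile(r'Broker Name\s*:.*?\(fax\)', re.DOTALL)

def extract_block(page_text):
    """Return the Broker Name ... (fax) block, or None if the page has none."""
    match = BLOCK_RE.search(page_text)
    return match.group(0) if match else None

def collect_blocks(page_texts, out_path):
    """Append each page's block to out_path, separated by a blank line."""
    count = 0
    with open(out_path, 'a', encoding='utf-8') as out:
        for text in page_texts:
            block = extract_block(text)
            if block is not None:
                out.write(block + '\n\n')
                count += 1
    return count  # number of blocks written
```

Pages without a "(fax)" line would be skipped by this sketch, so it is worth comparing the returned count against the number of pages.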