Tuesday 19 August 2014

Scraping Data from a Table with Python


I'm trying to scrape data from a website's table using Python.

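from bs4 import BeautifulSoup
from mechanize import Browser

BASE_URL = "http://www.ggp.com/properties/mall-directory"


def main():
    # Fetch the directory page and parse it.
    mech = Browser()
    page1 = mech.open(BASE_URL)
    html1 = page1.read()
    # Name the parser explicitly so bs4 does not have to guess one.
    soup1 = BeautifulSoup(html1, "html.parser")
    extract(soup1, 2007)


def extract(soup, year):
    # Note: year is passed in but not used yet.
    table = soup.find("table")
    # Each row here is a whole <option> tag, so printing it shows the markup.
    for row in table.findAll('option'):
        print(row)


if __name__ == "__main__":
    main()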

Printing each row gives:

<option value="184">Yakima, WA</option>
<option value="896">Yankton, SD</option>
<option value="851">Yazoo City, MS</option>
<option value="113">York-Hanover, PA</option>
<option value="87">Youngstown-Warren, OH-PA</option>
<option value="235">Yuba City, CA</option>
<option value="205">Yuma, AZ</option>
<option value="424">Zanesville, OH</option>

But what I need is

Yakima, WA
Yankton, SD
Yazoo City, MS
York-Hanover, PA
etc...

I've tried row.findAll('option value'), but that doesn't work.
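
One possible approach, sketched here rather than confirmed against the live page: each row is already an <option> tag, so its text can be read directly with get_text() instead of searching inside it again. The SAMPLE_HTML string and the option_labels helper are hypothetical names used only for illustration. The markup is copied from the output above, and wrapping it in a <select> inside the table is an assumption about the real page.

```python
from bs4 import BeautifulSoup

# Assumed markup: the <option> lines come from the output above; the
# surrounding <table>/<select> structure is a guess at the real page.
SAMPLE_HTML = """
<table><tr><td><select>
<option value="184">Yakima, WA</option>
<option value="896">Yankton, SD</option>
<option value="851">Yazoo City, MS</option>
<option value="113">York-Hanover, PA</option>
</select></td></tr></table>
"""


def option_labels(soup):
    """Return the visible text of every <option> inside the first <table>."""
    table = soup.find("table")
    # get_text() returns the text between the tags; strip=True trims whitespace.
    return [option.get_text(strip=True) for option in table.find_all("option")]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
labels = option_labels(soup)
for label in labels:
    print(label)
```

If the numeric code is needed as well, the attribute can be read from the same tag with option["value"].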



Source: http://stackoverflow.com/questions/24124291/scraping-data-from-table-python
