The code worked and then printed “index out of range”? - Python


I am scraping component prices from a distributor's website. After working with the page's HTML script tag, I was able to write one complete record for a single symbol to a CSV file. I then tried to run the same code for every component whose price I need.
It worked for the first 5 items, which were written to the CSV file.
Then it raised IndexError: list index out of range at this line:
f.write('{},{},{}\n'.format(price[0], data_2['part_available'],
data_2['pn_sku']))
When I removed "part_available" and "pn_sku" from that line, it raised
"IndexError: tuple index out of range" instead.
I never use 5 as an index anywhere, so I don't understand why it worked for only the first 5 items.
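On the second error: one plausible reading, assuming the edit removed only the two extra arguments and left all three `{}` placeholders in the template, is that `str.format` ran out of positional arguments. The sketch below reproduces that with made-up values. The exact message text differs between Python versions, but the exception type is `IndexError` either way.

```python
# Hypothetical values standing in for price[0], part_available and pn_sku.
template = '{},{},{}\n'

# Three placeholders, three arguments: formats normally.
line = template.format('0.10', 1123291, '399-1168-1-ND')

# Three placeholders, one argument: format() needs positional
# argument 1, which does not exist, so it raises IndexError.
try:
    template.format('0.10')
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__

print(repr(line))
print(error_name)
```

If that is what happened, the "tuple" in the message is the argument tuple passed to `format`, not anything in the spreadsheet or the scraped data.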

import yaml
import re
import xlrd
from urllib import urlopen as uReq
from bs4 import BeautifulSoup as soup

book = xlrd.open_workbook("SKU_ID_data.xlsx")
# List sheet names, and pull a sheet by name
sheet_names = book.sheet_names()
xl_sheet = book.sheet_by_name(sheet_names[0])
xl_sheet = book.sheet_by_index(0)
# First row and number of rows
row = xl_sheet.row(0)
num_rows = xl_sheet.nrows
i = 0
while i < num_rows:
    symbolslist = xl_sheet.cell(i,0).value
    my_url = "https://www.digikey.com/products/en?keywords="+ symbolslist +""
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")
    containers = page_soup.body.script
    filename = "products.csv"
    data = containers.text
    #get the price
    regex = '<span itemprop="price" id="schema-offer" >(.+?)</span>'
    pattern =re.compile(regex)
    price = re.findall(pattern, page_html)
    #get the available quantity and SKU_id
    data_2 = yaml.load(containers.text.replace('var utag_data =', '', 1))
    with open('products.csv', 'a') as f:
        f.write('{},{},{}\n'.format(price[0], data_2['part_available'],
                                    data_2['pn_sku']))
    i+=1
f.close()

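The traceback points at `price[0]`, which fails when `re.findall` matches nothing and returns an empty list. If that is the cause, the five successful rows would simply mean the sixth page did not contain that exact `<span>`. This is an assumption, since the sixth page's HTML is not shown. A minimal sketch of a guard, using a hypothetical helper `first_price` and inline sample HTML in place of live requests:

```python
import re

# Same pattern as in the script above, compiled once.
PRICE_RE = re.compile(r'<span itemprop="price" id="schema-offer" >(.+?)</span>')


def first_price(page_html):
    """Return the first matched price string, or None when nothing matches."""
    matches = PRICE_RE.findall(page_html)
    return matches[0] if matches else None


# Hypothetical pages: one with the price span, one without it
# (for example a search-results page rather than a product page).
pages = {
    '399-1168-1-ND': '<span itemprop="price" id="schema-offer" >0.10</span>',
    'EXAMPLE-NO-PRICE': '<html><body>no price span here</body></html>',
}

rows = []
skipped = []
for sku, html in pages.items():
    price = first_price(html)
    if price is None:
        skipped.append(sku)  # record the miss instead of raising IndexError
        continue
    rows.append((price, sku))

print(rows)
print(skipped)
```

Printing or logging the skipped SKUs shows which pages need a closer look, rather than having the whole run stop at the first page without a price.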
Please leave your comments.
Here is the SKU_ID_data:
399-1168-1-ND
296-1232-5-ND
631-1073-1-ND
RJHSE-5084-ND
P220FCT-ND
497-1222-1-ND
...and so on, 65 items in total.
Here is the result in the products.csv file:
Price | Part_Available | SKU_id
0.10  | 1123291        | 399-1168-1-ND
0.73  | 2382           | 296-1232-5-ND
2.34  | 10815          | 631-1073-1-ND
1.26  | 3159           | RJHSE-5084-ND
0.10  | 48763          | P220FCT-ND
And the error stack:
runfile('E:/Dropbox/Scrapingweb_test1.py', wdir='E:/Dropbox')
Traceback (most recent call last):
  File "<ipython-input-103-dbcfaeb354a5>", line 1, in <module>
    runfile('E:/Dropbox/Scrapingweb_test1.py', wdir='E:/Dropbox')
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "E:/Dropbox/Scrapingweb_test1.py", line 43, in <module>
    f.write(price[0])
IndexError: list index out of range
