I want to scrape the job listings from https://www.akzonobel.com/nl/careers/vacatures/. The country should be "The Netherlands" and the job level should be "Entry level". How do I send the form data with a POST request?
I send the POST request with HTTParty, but it keeps returning the initial 10 job listings. The correct response should contain 3 job listings.
require 'httparty'
require 'nokogiri'
@base_url = 'https://www.akzonobel.com'
url = "#{@base_url}/careers/vacatures/"
data = {
'ctl00$contentLeft$ctl01$ddlCountryExt' => 'NLD',
'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL'
}
response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)
html = Nokogiri::HTML(response)
jobs = html.xpath('//h3//a')
jobs.each do |job|
puts job.text
end
puts jobs.size
This returns:
Regional Demand Planner Nordeuropa (m,w)
Forecast Analyst - TiO2 Spend Area
PS Regional Manager APAC
Production leader
Engineering Administrator - Temporary
Procurement Manager EMEA
Business Analyst, Americas
HR Business Partner Supply Chain and R&D
AS Regional Manager
Business Information Manager
10
That is the code I am using. How can I send the required form data to the site so that I get the correct response?
Update: I have tried the following:
require 'httparty'
require 'nokogiri'
@base_url = 'https://www.akzonobel.com'
url = "#{@base_url}/careers/vacatures/"
data = {
'ctl00$contentLeft$ctl01$ddlCountryExt' => 'NLD',
'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL',
'ctl00$contentLeft$ctl01$ddlContinentExt' => 1,
'ctl00$contentLeft$ctl01$ddlRegionEx' => 4,
'ctl00$contentLeft$ctl01$ddlJobFamilyEx' => 45,
'ctl00$contentLeft$ctl01$ddlBusinessUnitExt' => 22,
'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 1,
'ctl00$contentLeft$ctl01$ddlCountryExt' => 1,
}
response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)
html = Nokogiri::HTML(response)
jobs = html.xpath('//h3//a')
jobs.each do |job|
puts job.text
end
puts jobs.size
Unfortunately, the result is exactly the same.
Update 2:
require 'httparty'
require 'nokogiri'
@base_url = 'https://www.akzonobel.com'
url = "#{@base_url}/careers/vacatures/"
data = {
'contentLeft_ctl01_ddlContinentExt' => 'C_EUROPE',
'contentLeft_ctl01_ddlCountryExt' => 'NLD',
'contentLeft_ctl01_ddlRegionExt' => 'Gelderland',
'contentLeft_ctl01_ddlRegionExt' => 'Limburg',
'contentLeft_ctl01_ddlRegionExt' => 'North Holland',
'contentLeft_ctl01_ddlRegionExt' => 'South Holland',
'contentLeft_ctl01_ddlJobFamilyExt' => 'General Management',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Integrated Supply Chain',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Sales & Marketing',
'contentLeft_ctl01_ddlJobFamilyExt' => 'RD&I',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Support',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Other',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_General Management',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Manufacturing',
'contentLeft_ctl01_ddlJobFamilyExt' => 'HSE',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Engineering',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Procurement',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Distribution & Logistics',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Sales',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Marketing',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_RD&I',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Finance',
'contentLeft_ctl01_ddlJobFamilyExt' => 'IM',
'contentLeft_ctl01_ddlJobFamilyExt' => 'HR',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Legal, IP & Compliance',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Facilities',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_Other',
'contentLeft_ctl01_ddlJobFamilyExt' => '80200000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80300000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81900000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81100000',
'contentLeft_ctl01_ddlJobFamilyExt' => '82000000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81200000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80700000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80400000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80500000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80800000',
'contentLeft_ctl01_ddlJobFamilyExt' => '80900000',
'contentLeft_ctl01_ddlJobFamilyExt' => '82100000',
'contentLeft_ctl01_ddlJobFamilyExt' => '82200000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81010000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81020000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81030000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81040000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81300000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81410000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81420000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81430000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81600000',
'contentLeft_ctl01_ddlJobFamilyExt' => '81700000',
'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl3_Other',
'contentLeft_ctl01_ddlBusinessUnitExt' => '52000100',
'contentLeft_ctl01_ddlBusinessUnitExt' => '52000200',
'contentLeft_ctl01_ddlBusinessUnitExt' => '52000300',
'contentLeft_ctl01_ddlBusinessUnitExt' => '52000900',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000010',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000013',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000020',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000022',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000026',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000033',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000038',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000041',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000054',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000055',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000056',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000061',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000063',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000100',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000300',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000900',
'contentLeft_ctl01_ddlBusinessUnitExt' => '53000901',
'contentLeft_ctl01_ddlBusinessUnitExt' => '51000000',
'contentLeft_ctl01_ddlJobLevelExt' => 'ENTRY_LEVEL'
}
response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)
html = Nokogiri::HTML(response)
jobs = html.xpath('//h3//a')
jobs.each do |job|
puts job.text
end
puts jobs.size
The updated code above gives me the same result as before.
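A side note on the hashes above: a Ruby Hash holds one value per key, so when the same key is listed many times only the last value is actually sent. The sketch below illustrates this with a shortened, illustrative subset of the fields from Update 2, and shows that `URI.encode_www_form` from the standard library keeps repeated keys when given an array of pairs. Whether the site expects repeated keys at all is an assumption that has not been verified.

```ruby
require 'uri'

# Shortened, illustrative subset of the fields used in Update 2.
pairs = [
  ['contentLeft_ctl01_ddlRegionExt', 'Gelderland'],
  ['contentLeft_ctl01_ddlRegionExt', 'Limburg'],
  ['contentLeft_ctl01_ddlJobLevelExt', 'ENTRY_LEVEL']
]

# Converting to a Hash collapses repeated keys; the last value wins,
# exactly as it does in a hash literal with duplicate keys.
collapsed = pairs.to_h
puts collapsed.size                               # 2
puts collapsed['contentLeft_ctl01_ddlRegionExt']  # Limburg

# An array of pairs preserves every entry when form-encoded.
encoded = URI.encode_www_form(pairs)
puts encoded
```

HTTParty accepts a String for `:body`, so an encoded string like this could be passed in place of the Hash if repeated keys turn out to be needed.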
HTTParty is not the right tool for this kind of scraping. If the page does not need JavaScript execution, use Mechanize.
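One reason a form-aware client such as Mechanize can help is that it submits every field of the parsed form, including hidden inputs, along with the values you set. Field names like `ctl00$contentLeft$...` suggest an ASP.NET WebForms page, and such pages commonly carry hidden fields like `__VIEWSTATE` that must be posted back. The following is a minimal, dependency-free sketch of that idea. The sample HTML, its hidden values, and the `hidden_fields` helper are hypothetical; the real page was not inspected, and a real page is better parsed with Nokogiri than with a regex.

```ruby
require 'uri'

# Hypothetical stand-in for the HTML returned by a GET of the search page.
SAMPLE_HTML = <<~'HTML'
  <form method="post" action="./">
    <input type="hidden" name="__VIEWSTATE" value="abc123" />
    <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
    <select name="ctl00$contentLeft$ctl01$ddlCountryExt"></select>
  </form>
HTML

# Hypothetical helper: collect name/value pairs of hidden <input> tags.
# A regex scan is enough for this small sample only.
def hidden_fields(html)
  html.scan(/<input\b[^>]*>/i).each_with_object({}) do |tag, fields|
    next unless tag =~ /type=["']hidden["']/i
    name  = tag[/name=["']([^"']*)["']/i, 1]
    value = tag[/value=["']([^"']*)["']/i, 1]
    fields[name] = value.to_s if name
  end
end

# The filter values from the question.
filters = {
  'ctl00$contentLeft$ctl01$ddlCountryExt'  => 'NLD',
  'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL'
}

# Post the hidden state back together with the chosen filters.
payload = hidden_fields(SAMPLE_HTML).merge(filters)
body = URI.encode_www_form(payload)
puts body
```

With a live page, the HTML would come from an initial GET request (keeping any cookies it sets), and the resulting `payload` would be sent in the follow-up POST.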