  • parsing
  • selenium
  • beautifulsoup
  • bs4
  • 2016-06-14 4 views 2 likes 

    Hey guys, the main search bar on Amazon has the following markup (for parsing):

    <input type="submit" class="nav-input" value="Go" tabindex="7"> 

    So I found this tag and thought about building a function around it. To search for a specific keyword, I think I would have to do something like this. My thoughts are:

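    import urllib.parse
    from selenium import webdriver

    path = ''
    values = {'s': 'what-I-want-to-search'}
    data = urllib.parse.urlencode(values)
    data = data.encode('utf-8')
    driver = webdriver.PhantomJS()
    # Attempt: send the encoded search term along with the URL.
    # Note: WebDriver.get() accepts only a URL, so the extra argument fails.
    driver.get(path, data)
    html = driver.page_source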

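    Following the sentdex tutorial, I encode the search term and then send it to the HTML path. I have been using Selenium to deal with dynamically loaded web pages, although I think I would be fine without it in this case. Either way, I need to know how to get Python to search for something on the main site and land on the search results page. Any help?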
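Since `driver.get()` only takes a URL, one way to carry the encoded search term is to append it to the URL as a query string. The following is a minimal sketch of that idea using only the standard library. The helper name `build_search_url`, the base URL `https://example.com/search`, and the parameter name `q` are hypothetical placeholders, not the real endpoint or field name of any particular site.

```python
from urllib.parse import urlencode


def build_search_url(base_url, keyword, param_name="q"):
    """Return base_url with the keyword appended as a URL-encoded query string.

    base_url and param_name are placeholders; the real values depend on the site.
    """
    query = urlencode({param_name: keyword})
    # Use "&" if the URL already carries a query string, otherwise start one.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query


print(build_search_url("https://example.com/search", "python books"))
# prints: https://example.com/search?q=python+books
```

The resulting string could then be passed to `driver.get(...)` or to any HTTP client, assuming the site accepts the search term as a GET parameter.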




    requests and bs4 will do what you want, my friend. You just need to pass the correct params, which you can see if you look at the Network tab in Chrome dev tools:


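    In [4]: from bs4 import BeautifulSoup
    In [5]: import requests
    In [6]: params = {"url": "search-alias=",
       ...:           "field-keywords": "python"}
    In [7]: with requests.Session() as s:
       ...:     url = ""
       ...:     r = s.get(url, params=params)
       ...:     soup = BeautifulSoup(r.content, "lxml")
       ...:     # The line that defined `cont` is missing from this copy; the selector
       ...:     # below is borrowed from parse() later in this answer.
       ...:     cont = soup.select("a.a-link-normal.s-access-detail-page.a-text-normal")
       ...:     for a in cont:
       ...:         print(a["title"])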
    Python Programming for the Absolute Beginner 
    Python: The Ultimate Beginner's Guide! 
    Automate the Boring Stuff with Python: Practical Programming for Total Beginners 
    Python: Learn Python in One Day and Learn It Well. Python for Beginners with Hands-on Project. (Learn Coding Fast with Hands-On Project Book 1) 
    Python Crash Course: A Hands-On, Project-Based Introduction to Programming 
    Learning Python 
    Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython 
    Python Cookbook 
    Python for Informatics: Exploring Information 
    Fluent Python 
    Python Playground: Geeky Projects for the Curious Programmer 
    Python in easy steps 
    Learn Python the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Zed Shaw's Hard Way) 
    Python: The Ultimate Beginners Guide: Start Coding Today 
    Programming the Raspberry Pi, Second Edition: Getting Started with Python 
    Data Science from Scratch: First Principles with Python 

    To break the code into functions and get all the pages, we just need to keep looping until the anchor with the id pagnNextLink no longer appears on the page:

    from bs4 import BeautifulSoup
    import requests
    from urllib.parse import urljoin  # Python 3
    # from urlparse import urljoin  # Python 2


    def parse(soup):
        # Yield the list of result titles found on one results page.
        yield [a["title"] for a in soup.select("a.a-link-normal.s-access-detail-page.a-text-normal")]


    def get(term):
        params = {"url": "search-alias=",
                  "field-keywords": term}
        with requests.Session() as s:
            head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
            url = ""
            # The first request is only used to find the next-page link.
            r = s.get(url, params=params)
            soup = BeautifulSoup(r.content, "lxml")
            nxt = soup.select_one("#pagnNextLink")
            # Keep following the "next page" anchor until it no longer appears.
            while nxt:
                cont = requests.get(urljoin("", nxt["href"]), headers=head)
                soup = BeautifulSoup(cont.content, "lxml")
                for t in parse(soup):
                    print(t)
                nxt = soup.select_one("#pagnNextLink")

    If we run the code for a couple of iterations:

    In [5]: get("python") 
    ['Python Machine Learning', 'Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development)', 'Black Hat Python: Python Programming for Hackers and Pentesters', 'Doing Math with Python: Use Programming to Explore Algebra, Statistics, Calculus, and More!', 'Think Python: How to Think Like a Computer Scientist', 'Python Basics, Level 1 (Coding Club) (Coding Club, Level 1)', 'Python for Finance: Analyze Big Financial Data', 'Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers', "Python Essential Reference (Developer's Library)", 'Learn Web Scraping With Python In A Day: The Ultimate Crash Course to Learning the Basics of Web Scraping With Python In No Time (Python, Python ... Python Books, Python for Beginners)', 'Programming Python', 'QPython - Python on Android', 'Coding Club Python: Next Steps Level 2', "Python: Programming, Master's Handbook; A TRUE Beginner's Guide! Problem Solving, Code, Data Science, Data Structures & Algorithms (Code like a PRO ... engineering, r programming, iOS development)", 'Python: Complete Crash Course for Becoming an Expert in Python Programming', 'Coding Club Python: Building Big Apps Level 3'] 
    ['High Performance Python: Practical Performant Programming for Humans', '25ft Python No Spill Clean And Fill', 'Learning Python with Raspberry Pi', 'Web Scraping with Python: Collecting Data from the Modern Web', 'Invent Your Own Computer Games with Python, 3rd Edition', 'More Python Programming for the Absolute Beginner', 'Python for Kids: A Playful Introduction to Programming', "Monty Python's Life of Brian", 'Python 3 Object-oriented Programming - Second Edition', 'Introduction to Computation and Programming Using Python', 'Evolution of The Silly Walks T Shirt - Funny TV Ministry - Various Colours and Sizes XS - 3XL', "Hacking Secret Ciphers with Python: A beginner's guide to cryptography and computer programming with Python", 'Monty Python Fluxx', 'MASTER LOCK 8417DPRO Python Cable 1.80 m x 5 mm 2 Keys', "Learn Python: A beginner's guide book to programming python, learning the basics and start coding easily", 'Master Lock Python Disc Cylinder Key Adjustable Braided Steel Cable Lock, 10 x 1800 mm - Black'] 
    In [6]: get("c programming") 
    ['C Programming', 'C# 6.0 in a Nutshell: The Definitive Reference', 'PIC microcontrollers Programming in C with examples', 'C++: The Ultimate Crash Course to Learning the Basics of C++ In No Time (c plus plus, C++ for beginners, programming computer, how to program) (HTML, Javascript, ... Java, C++ Course, C++ Development Book 3)', 'Java: The Best Guide to Master Java Programming Fast (Java for Beginners, Java for Dummies, how to program, java app, java programming): Volume 2 (C Programming, HTML, Javascript)', 'A Book on C.: Programming in C.', "Learn C the Hard Way: Practical Exercises on the Computational Subjects You Keep Avoiding (Like C) (Zed Shaw's Hard Way Series)", 'C++: C++ and Hacking for dummies. A smart way to learn C plus plus and beginners guide to computer hacking: Volume 10 (C Programming, HTML, Javascript, Programming, Coding, CSS, Java, PHP)', 'Introduction to Algorithms', 'Programming: Computer Programming for Beginners: Learn the Basics of Java, SQL & C++ - 2. Edition (Coding, C Programming, Java Programming, SQL Programming, JavaScript, Python, PHP)', '21st Century C: C Tips from the New School', 'C For Dummies', 'Learn C# Programming Training DVD - Tutorial Video', 'GT01-C30R2-6P Programming PLC Cable 2.5M for Mitsubishi Melsec A970', 'Programming In C', 'Get Coding!: Learn HTML, CSS & JavaScript & build a website, app & game'] 
    ['Hewlett Packard [HP] Calculator Financial Platinum RPN Algebraic Programmable Ref HP12C PLATINUM', 'C: Easy C Programming for Beginners, Your Step-By-Step Guide To Learning C Programming (C Programming Series)', '4.9M RS232 DB9 F/M PLC Programming Cable Adapter White for Omron CQM1 C200HE HG', 'KOREAN COSMETICS, LG Household & Health Care_ SUM37, Secret Programming Eye C...', 'C++: C++ and Python. C++ for Beginners and Python for Dummies to Learn Fast (C Programming, Programming for beginners, c plus plus, programming ... Developers, Coding, CSS, Java, PHP)', '1:8 Brushless Combo BLC-150C Plus + Ripper 2000KV motor + programming Board', 'Lonely Planet Italian Phrasebook & Audio', 'Full Forgiveness - Let Go of Hurt & Offense With Guided Imagery, Self Hypnosis and Neuro-linguistic Programming (NLP)', 'Accelerated C++: Practical Programming by Example (C++ in Depth Series)', 'Gardena Water Computer C1060plus 1864-20', 'Learning To Build Apps For iPhone and iPad - Training DVD', 'Practical C Programming (A Nutshell handbook)', 'Prince Brat and the Whipping Boy', 'English: Practice Test Papers (Letts Key Stage 2 Success) (Letts Key Stage 1 Success)', 'Arabic For Dummies: Audio Set', 'The Actor and the Text (Applause Acting Series)'] 

    You can parse whatever you like; I just pulled the titles so we could easily see that we are getting the correct data. I would also consider adding a sleep between requests.
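As a rough illustration of the "sleep between requests" suggestion, here is a minimal sketch of a throttled fetch loop. The helper `fetch_politely` and its parameters (`fetch`, `delay_seconds`, `sleep`) are hypothetical names introduced for this example and are not part of requests or bs4. The `fetch` callable is injected so the pacing logic stays independent of the HTTP client.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    fetch is any callable taking a URL (hypothetical; supplied by the caller).
    sleep is injectable so the delay can be replaced, e.g. in tests.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

With requests, this might be called as `fetch_politely(page_urls, lambda u: requests.get(u, headers=head))`, where `page_urls` is whatever list of result-page URLs you have collected. A fixed delay is the simplest choice; a suitable value depends on the site and is an assumption here.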


    Unreal, man. Amazing, thank you. – entercaspa
