Add slimmed-down Selenium project code #596

Open
wants to merge 24 commits into
base: master
24 commits
56c2360
Add Selenium project code
martin-martin Oct 14, 2024
eb7db86
Merge branch 'master' into python-selenium
bzaczynski Oct 23, 2024
648edc9
Apply Technical Review suggestions
martin-martin Oct 28, 2024
32b9670
Merge branch 'python-selenium' of https://github.com/realpython/mater…
martin-martin Oct 28, 2024
79a06fc
Merge branch 'master' into python-selenium
martin-martin Oct 28, 2024
a0fd072
Restructures project to work with /discover page
martin-martin Mar 12, 2025
ae7cdf5
Re-apply old TR feedback
martin-martin Mar 13, 2025
07280d0
Fix linter error
martin-martin Mar 13, 2025
a0a3391
Fix black formatting
martin-martin Mar 13, 2025
d17eb80
Merge branch 'master' into python-selenium
martin-martin Mar 13, 2025
21d505c
Apply TR feedback
martin-martin Mar 16, 2025
b6b18e8
Fix double-exit loop
martin-martin Mar 16, 2025
37a10cf
Refix ruff to black
martin-martin Mar 16, 2025
ee77eb2
Merge branch 'master' into python-selenium
martin-martin Mar 16, 2025
446dec2
Update project code, add training files
martin-martin Apr 2, 2025
67f6d93
Format black
martin-martin Apr 2, 2025
4605c10
Merge branch 'master' into python-selenium
martin-martin Apr 2, 2025
79b9b8a
Rename modules, improve CLI output, rename entry point
martin-martin Apr 4, 2025
ef6919a
blacken
martin-martin Apr 4, 2025
3f35024
Merge branch 'python-selenium' of https://github.com/realpython/mater…
martin-martin Apr 4, 2025
bf5326a
Add implicit wait
martin-martin Apr 4, 2025
4d0304f
Merge branch 'master' into python-selenium
bzaczynski Apr 7, 2025
7f22ab0
Add package info
martin-martin Apr 8, 2025
d80539b
Merge branch 'python-selenium' of https://github.com/realpython/mater…
martin-martin Apr 8, 2025
64 changes: 64 additions & 0 deletions python-selenium/README.md
@@ -0,0 +1,64 @@
# Modern Web Automation With Python and Selenium

This repository contains the module `bandcamp`, which is the sample app built in the Real Python tutorial [Modern Web Automation With Python and Selenium](https://realpython.com/modern-web-automation-with-python-and-selenium/).

## Installation and Setup

Create and activate a [Python virtual environment](https://realpython.com/python-virtual-environments-a-primer/).

Then, install the requirements:

```sh
(venv) $ python -m pip install -r requirements.txt
```

The only direct dependency for this project is [Selenium](https://selenium-python.readthedocs.io/). You should use a Python version of at least 3.10, which is necessary to support [structural pattern matching](https://realpython.com/structural-pattern-matching/).
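
If you're unsure whether your interpreter meets that requirement, a quick check along these lines can help. This is a minimal sketch: the names `MIN_PYTHON` and `python_is_supported` are made up for illustration and aren't part of the project.

```python
import sys

# Hypothetical constant. The value mirrors the 3.10 minimum stated above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter is new enough for match statements."""
    # Compare only the (major, minor) pair, ignoring the patch level.
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("This project needs Python 3.10 or later.")
```

Running the sketch as a script exits with a message on older interpreters and does nothing otherwise.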

You'll need a [Firefox Selenium driver](https://selenium-python.readthedocs.io/installation.html#drivers) called `geckodriver` to run the project as-is. Make sure to [download and install](https://github.com/mozilla/geckodriver/releases) it before running the project.

## Run the Bandcamp Discover Player

To run the music player, install the package, then run one of the entry point commands that `pyproject.toml` registers, `pycamp` or `discover`, from your command line:

```sh
(venv) $ python -m pip install .
(venv) $ pycamp
```

You'll see a text-based user interface that allows you to interact with the music player:

```
Type: play [<track number>] | pause | tracks | more | exit
>
```

Type one of the available commands to interact with Bandcamp's Discover section through your headless browser. Listen to songs with `play`, optionally followed by a track number. Pause the current song with `pause` and resume it with `play`. List available tracks with `tracks`, and load more songs using `more`. You can exit the music player by typing `exit`.

## Troubleshooting

If the music player seems to hang when you run the script, confirm whether you've correctly set up your WebDriver based on the following points.

### Version Compatibility

Confirm that your browser and corresponding WebDriver are in sync. If you followed the previous suggestion, then you should be using Firefox and geckodriver. If that doesn't work for any reason, you may need to switch browser _and_ WebDriver.

For example, if you're using Chrome, then you need to install ChromeDriver and it must match your Chrome version. Otherwise, you may run into errors like `SessionNotCreatedException`.
For more details, refer to the official [ChromeDriver documentation](https://sites.google.com/chromium.org/driver/) or [geckodriver releases](https://github.com/mozilla/geckodriver/releases).
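
As a rough illustration of the Chrome case, you could compare the major version numbers reported by the browser and the driver. This sketch only parses text. The sample strings follow the usual shape of `--version` output but are assumptions here, and the helper names are hypothetical. It doesn't apply to Firefox, because geckodriver uses its own version numbering.

```python
import re


def major_version(version_output):
    """Extract the first major version number from a --version string."""
    found = re.search(r"(\d+)\.\d+", version_output)
    return int(found.group(1)) if found else None


def versions_match(browser_output, driver_output):
    """Return True if both strings report the same major version."""
    browser = major_version(browser_output)
    driver = major_version(driver_output)
    return browser is not None and browser == driver


if __name__ == "__main__":
    # Assumed sample output, not captured from a real installation.
    print(versions_match("Google Chrome 131.0.6778.85", "ChromeDriver 131.0.6778.69"))
```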

### Driver Installation and Path Issues

Once you've confirmed that your browser and driver match, make sure that the WebDriver executable is properly installed:

- **Path Configuration:** The driver must be in your system's PATH, or you need to specify its full path in your code.
- **Permissions:** Ensure the driver file has the necessary execution permissions.
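
Both points can be checked with the standard library before you start Selenium. The sketch below is an assumption-laden helper rather than part of the project: `describe_driver` is a made-up name, and the default of `geckodriver` simply follows the setup described earlier.

```python
import os
import shutil


def describe_driver(name="geckodriver", path=None):
    """Report whether a WebDriver executable can be found and executed."""
    # Use the explicit path if one is given. Otherwise search PATH.
    candidate = path or shutil.which(name)
    if candidate is None:
        return f"{name} was not found on PATH"
    if not os.path.isfile(candidate):
        return f"{candidate} does not exist"
    if not os.access(candidate, os.X_OK):
        return f"{candidate} is not executable"
    return f"{candidate} looks usable"


if __name__ == "__main__":
    print(describe_driver())
```

Note that `shutil.which()` only returns executable files, so the permission check mainly matters when you pass an explicit `path`.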

If you're still running into issues executing the script, then consult the [Selenium Documentation](https://www.selenium.dev/documentation/) for additional troubleshooting tips or leave a comment on the tutorial.

## About the Authors

- Martin Breuss - Email: martin@realpython.com
- Bartosz Zaczyński - Email: bartosz@realpython.com

## License

Distributed under the MIT license.
24 changes: 24 additions & 0 deletions python-selenium/pyproject.toml
@@ -0,0 +1,24 @@
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "pycamp"
version = "0.1.1"
requires-python = ">=3.10"
description = "A CLI music player for Bandcamp's Discover page using Python and Selenium"
license = "MIT"
classifiers = [
    "Programming Language :: Python :: 3",
    "Operating System :: OS Independent"
]
authors = [
    { name = "Martin Breuss", email = "martin@realpython.com" },
    { name = "Bartosz Zaczyński", email = "bartosz@realpython.com" },
]
dependencies = [
    "selenium",
]

[project.scripts]
discover = "bandcamp.__main__:main"
pycamp = "bandcamp.__main__:main"
15 changes: 15 additions & 0 deletions python-selenium/requirements.txt
@@ -0,0 +1,15 @@
attrs==25.2.0
certifi==2025.1.31
h11==0.14.0
idna==3.10
outcome==1.3.0.post0
pysocks==1.7.1
selenium==4.29.0
sniffio==1.3.1
sortedcontainers==2.4.0
trio==0.29.0
trio-websocket==0.12.2
typing-extensions==4.12.2
urllib3==2.3.0
websocket-client==1.8.0
wsproto==1.2.0
Empty file.
6 changes: 6 additions & 0 deletions python-selenium/src/bandcamp/__main__.py
@@ -0,0 +1,6 @@
from bandcamp.app.tui import interact


def main():
    """Provide the main entry point for the app."""
    interact()
Empty file.
43 changes: 43 additions & 0 deletions python-selenium/src/bandcamp/app/player.py
@@ -0,0 +1,43 @@
from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options

from bandcamp.web.pages import DiscoverPage

BANDCAMP_DISCOVER_URL = "https://bandcamp.com/discover/"


class Player:
    """Play tracks from Bandcamp's Discover page."""

    def __init__(self) -> None:
        self._driver = self._set_up_driver()
        self.page = DiscoverPage(self._driver)
        self.tracklist = self.page.discover_tracklist
        self._current_track = self.tracklist.available_tracks[0]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        """Close the headless browser."""
        self._driver.quit()

    def play(self, track_number=None):
        """Play the current track, or one of the available numbered tracks."""
        if track_number is not None:
            # Track numbers are 1-based. Reject zero and negative values so
            # they can't silently wrap around to the end of the list.
            if track_number < 1:
                raise IndexError("Track numbers start at 1.")
            self._current_track = self.tracklist.available_tracks[
                track_number - 1
            ]
        self._current_track.play()

    def pause(self):
        """Pause the current track."""
        self._current_track.pause()

    def _set_up_driver(self):
        """Create a headless browser pointing to Bandcamp."""
        options = Options()
        options.add_argument("--headless")
        browser = Firefox(options=options)
        browser.get(BANDCAMP_DISCOVER_URL)
        return browser
77 changes: 77 additions & 0 deletions python-selenium/src/bandcamp/app/tui.py
@@ -0,0 +1,77 @@
from bandcamp.app.player import Player

COLUMN_WIDTH = CW = 30
MAX_TRACKS = 100  # Allows loading more tracks only once.


def interact():
    """Control the player through user interactions."""
    with Player() as player:
        while True:
            print(
                "\nType: play [<track number>] | pause | tracks | more | exit"
            )
            match input("> ").strip().lower().split():
                case ["play"]:
                    play(player)
                case ["play", track]:
                    try:
                        track_number = int(track)
                        play(player, track_number)
                    except ValueError:
                        print("Please provide a valid track number.")
                case ["pause"]:
                    pause(player)
                case ["tracks"]:
                    display_tracks(player)
                case ["more"] if (
                    len(player.tracklist.available_tracks) >= MAX_TRACKS
                ):
                    print(
                        "Can't load more tracks. Pick one from the track list."
                    )
                case ["more"]:
                    player.tracklist.load_more()
                    display_tracks(player)
                case ["exit"]:
                    print("Exiting the player...")
                    break
                case _:
                    print("Unknown command. Try again.")


def play(player, track_number=None):
    """Play a track and show info about the track."""
    try:
        player.play(track_number)
        print(player._current_track._get_track_info())
    except IndexError:
        print(
            "Please provide a valid track number. "
            "You can list available tracks with `tracks`."
        )


def pause(player):
    """Pause the current track."""
    player.pause()


def display_tracks(player):
    """Display information about the currently playable tracks."""
    header = f"{'#':<5} {'Album':<{CW}} {'Artist':<{CW}} {'Genre':<{CW}}"
    print(header)
    print("-" * 80)
    for track_number, track_element in enumerate(
        player.tracklist.available_tracks, start=1
    ):
        track = track_element._get_track_info()
        album = _truncate(track.album, CW)
        artist = _truncate(track.artist, CW)
        genre = _truncate(track.genre, CW)
        print(f"{track_number:<5} {album:<{CW}} {artist:<{CW}} {genre:<{CW}}")


def _truncate(text, width):
    """Truncate track information."""
    return text[: width - 3] + "..." if len(text) > width else text
Empty file.
34 changes: 34 additions & 0 deletions python-selenium/src/bandcamp/web/base.py
@@ -0,0 +1,34 @@
from dataclasses import dataclass
from pprint import pformat

from selenium.webdriver.remote.webdriver import WebDriver
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support.wait import WebDriverWait

MAX_WAIT_SECONDS = 10.0
DEFAULT_WINDOW_SIZE = (1920, 3000)


@dataclass
class Track:
    album: str
    artist: str
    genre: str
    url: str

    def __str__(self):
        return pformat(self)


class WebPage:
    def __init__(self, driver: WebDriver) -> None:
        self._driver = driver
        self._driver.set_window_size(*DEFAULT_WINDOW_SIZE)
        self._driver.implicitly_wait(5)
        self._wait = WebDriverWait(driver, MAX_WAIT_SECONDS)


class WebComponent(WebPage):
    def __init__(self, parent: WebElement, driver: WebDriver) -> None:
        super().__init__(driver)
        self._parent = parent
90 changes: 90 additions & 0 deletions python-selenium/src/bandcamp/web/elements.py
@@ -0,0 +1,90 @@
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.remote.webdriver import WebDriver
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support import expected_conditions as EC

from bandcamp.web.base import Track, WebComponent
from bandcamp.web.locators import TrackListLocator, TrackLocator


class TrackListElement(WebComponent):
    """Model the track list on Bandcamp's Discover page."""

    def __init__(self, parent: WebElement, driver: WebDriver = None) -> None:
        super().__init__(parent, driver)
        self.available_tracks = self._get_available_tracks()

    def load_more(self) -> None:
        """Load additional tracks."""
        view_more_button = self._driver.find_element(
            *TrackListLocator.PAGINATION_BUTTON
        )
        view_more_button.click()
        # The button is disabled until all new tracks are loaded.
        self._wait.until(
            EC.element_to_be_clickable(TrackListLocator.PAGINATION_BUTTON)
        )
        self.available_tracks = self._get_available_tracks()

    def _get_available_tracks(self) -> list:
        """Find all currently available tracks."""
        self._wait.until(
            self._track_text_loaded,
            message="Timeout waiting for track text to load",
        )

        all_tracks = self._driver.find_elements(*TrackListLocator.ITEM)

        # Filter tracks that are displayed and have text.
        return [
            TrackElement(track, self._driver)
            for track in all_tracks
            if track.is_displayed() and track.text.strip()
        ]

    def _track_text_loaded(self, driver):
        """Check if the track text has loaded."""
        return any(
            e.is_displayed() and e.text.strip()
            for e in driver.find_elements(*TrackListLocator.ITEM)
        )


class TrackElement(WebComponent):
    """Model a playable track on Bandcamp's Discover page."""

    def play(self) -> None:
        """Play the track."""
        if not self.is_playing:
            self._get_play_button().click()

    def pause(self) -> None:
        """Pause the track."""
        if self.is_playing:
            self._get_play_button().click()

    @property
    def is_playing(self) -> bool:
        return "Pause" in self._get_play_button().get_attribute("aria-label")

    def _get_play_button(self):
        return self._parent.find_element(*TrackLocator.PLAY_BUTTON)

    def _get_track_info(self) -> Track:
        """Create a representation of the track's relevant information."""
        full_url = self._parent.find_element(*TrackLocator.URL).get_attribute(
            "href"
        )
        # Cut off the referrer query parameter
        clean_url = full_url.split("?")[0] if full_url else ""
        # Some tracks don't have a genre
        try:
            genre = self._parent.find_element(*TrackLocator.GENRE).text
        except NoSuchElementException:
            genre = ""
        return Track(
            album=self._parent.find_element(*TrackLocator.ALBUM).text,
            artist=self._parent.find_element(*TrackLocator.ARTIST).text,
            genre=genre,
            url=clean_url,
        )
22 changes: 22 additions & 0 deletions python-selenium/src/bandcamp/web/locators.py
@@ -0,0 +1,22 @@
from selenium.webdriver.common.by import By


class DiscoverPageLocator:
    DISCOVER_RESULTS = (By.CLASS_NAME, "results-grid")
    COOKIE_ACCEPT_NECESSARY = (
        By.CSS_SELECTOR,
        "#cookie-control-dialog button.g-button.outline",
    )


class TrackListLocator:
    ITEM = (By.CLASS_NAME, "results-grid-item")
    PAGINATION_BUTTON = (By.ID, "view-more")


class TrackLocator:
    PLAY_BUTTON = (By.CSS_SELECTOR, "button.play-pause-button")
    URL = (By.CSS_SELECTOR, "div.meta p a")
    ALBUM = (By.CSS_SELECTOR, "div.meta p a strong")
    GENRE = (By.CSS_SELECTOR, "div.meta p.genre")
    ARTIST = (By.CSS_SELECTOR, "div.meta p a span")