Soup API
At the office, there's a catering service that brings soup to the workplace every working day. They have a menu online that tells you which soup it's going to be. So of course we need some way to automate getting this information and displaying it in our MS Teams chat.
Iteration 1: Handcrafting
My first iteration was quick and dirty: grab the page with curl, then use xmllint and a pile of sed commands to filter out the stuff I'm interested in.
#!/bin/bash
# You can use translate-shell to translate to your own language
# Example to translate to French:
# sudo apt install translate-shell
# ./soep | trans -b :fr
curl -s -H "Referer: https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_LANGUAGE=NL&M_SITENAME=PARTYLINE" \
  "https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_CATEGORIE=25" \
  | xmllint --format --html --xpath '//div[@class="row no-gutters mt-5"]' - \
  | sed -e 's/<[^>]*>/ /g' \
  | sed -e 's/ //g' \
  | sed '/^[[:space:]]*$/d' \
  | sed -e 's/^[ \t]*//' \
  | sed -e 's/ Koude suggestie.*//' \
  | sed -e 's/ Weekschotel.*//' \
  | sed 's/soep//g' \
  | sed 's/Soep//g' \
  | sed 's/Week//' \
  | sed 's/tem//' \
  | sed 's/[0-9]\+//g' \
  | sed 's/\///g' \
  | sed 's/Kalender//' \
  | sed 's/ \+/ /g' \
  | trans -b :en

Iteration 2: Using ChatGPT to write some better Python code
I ran my code through ChatGPT and asked it to improve upon it and rewrite it in Python. After some back and forth and a bit of debugging, this is what we ended up with:
import requests
from bs4 import BeautifulSoup
import urllib3

# Suppress SSL warnings
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

headers = {
    "Referer": "https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_LANGUAGE=NL&M_SITENAME=PARTYLINE"
}
url = "https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_CATEGORIE=25"

response = requests.get(url, headers=headers, verify=False)
response.raise_for_status()

soup = BeautifulSoup(response.text, 'html.parser')
container = soup.find('div', class_='row no-gutters mt-5')

for col in container.find_all('div', class_='col text-center border p-3'):
    strong_tags = col.find_all('strong')
    if len(strong_tags) < 2:
        continue
    day = strong_tags[0].text.strip()

    # Find the <p> tag containing 'Soep' and extract the soup name
    soep_name = None
    for p in col.find_all('p'):
        if 'soep' in p.get_text(strip=True).lower():
            next_br = p.find_next('br')
            if next_br:
                next_sibling = next_br.next_sibling
                if next_sibling and isinstance(next_sibling, str):
                    soep_name = next_sibling.strip()
            break
    if soep_name:
        print(f"{day}: {soep_name}")

Iteration 3: Use ChatGPT to use ChatGPT (aka: chatgptception)
The website owner of the catering business doesn't really like clean HTML, and the HTML output tends to differ from week to week, so we need some AI to interpret the website. So let's ship off the snippet that we know contains the menu (but that varies slightly from week to week) to OpenAI and let the model do the heavy lifting of extracting and formatting:
#!/usr/bin/env python3
import os
import requests
from bs4 import BeautifulSoup
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

url = "https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_CATEGORIE=25"
headers = {
    "Referer": "https://partyline.be/_predefined_pages/prijslijst_NL.asp?M_LANGUAGE=NL&M_SITENAME=PARTYLINE"
}

try:
    response = requests.get(url, headers=headers, verify=False, timeout=10)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Error fetching URL: {e}")
    exit(1)

soup = BeautifulSoup(response.text, "html.parser")
#print(soup.prettify())

content_div = soup.find("div", class_="row no-gutters mt-5")
if not content_div:
    raise RuntimeError("Soup menu div not found in HTML.")

html_snippets = str(content_div)
#print("Extracted HTML snippet for soups:")
#print(html_snippets)

prompt = f"""
This is HTML from a soup menu page. Extract soups grouped by week. Format output as:

Week van 30/6 tem 4/7
Maandag: Bloemkool
Dinsdag: Groene pesto
Woensdag: Selderij
Donderdag: Asperge
Vrijdag: Wortel met linzen

HTML:
{html_snippets}
"""

# --- New API usage ---
try:
    completion = openai.chat.completions.create(
        model="gpt-5-nano",
        messages=[
            {"role": "system", "content": "You are a helpful assistant specialized in extracting structured data from HTML."},
            {"role": "user", "content": prompt},
        ]
    )
    print(completion.choices[0].message.content.strip())
except Exception as e:
    print(f"Error from OpenAI API: {e}")