HTTP Request Language Preference
Date: 2026-02-06
Category: Parsing
Parts Completed: 3/4
Language: Python
Problem Summary
Build a content delivery system that determines which languages to serve based on the HTTP Accept-Language header. The problem has 4 progressive parts: exact matching, prefix matching, wildcard support, and quality factor weighting.
Solutions by Part
Part 1: Exact Language Tag Matching
Approach: Parse comma-separated header into list, use a set for O(1) lookup of supported languages, iterate header in order to maintain preference.
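def parse_accept_language(accept_header, supported_languages):
    result = []
    # Set gives O(1) membership checks against the supported languages.
    supported_set = set(supported_languages)
    # Split on commas and trim whitespace around each tag.
    accepted_languages = [lang.strip() for lang in accept_header.split(",")]
    # Walk the header in order so the result keeps the client's preference order.
    for language in accepted_languages:
        if language in supported_set:
            result.append(language)
    return result

Part 2: Generic Language Matching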
Approach: Added prefix matching - if a tag has no hyphen, use startswith() to match all supported variants. Used a seen set for O(1) dedup. Single pass over header to preserve preference order.
def parse_accept_language(accept_header, supported_languages):
    result = []
    seen = set()  # tags already added to result, for O(1) dedup

    def parse_header(accept_header):
        return [language.strip() for language in accept_header.split(",")]

    supported_set = set(supported_languages)
    accepted_languages = parse_header(accept_header)
    for language in accepted_languages:
        # Exact match.
        if language in supported_set and language not in seen:
            seen.add(language)
            result.append(language)
        # Generic tag (no region): match every supported variant, e.g. "fr" -> "fr-FR".
        elif "-" not in language:
            for sup_language in supported_languages:
                if sup_language.startswith(language + "-") and sup_language not in seen:
                    seen.add(sup_language)
                    result.append(sup_language)
    return result

Part 3: Wildcard Support
Approach: Added * handling as another elif branch - when encountered, add all unseen supported languages. Key bug caught: * has no hyphen so it was falling into the prefix branch before the wildcard check.
def parse_accept_language(accept_header, supported_languages):
    result = []
    seen = set()  # tags already added to result, for O(1) dedup

    def parse_header(accept_header):
        return [language.strip() for language in accept_header.split(",")]

    supported_set = set(supported_languages)
    accepted_languages = parse_header(accept_header)
    for language in accepted_languages:
        # Exact match.
        if language in supported_set and language not in seen:
            seen.add(language)
            result.append(language)
        # Wildcard: add every supported language not yet in the result.
        elif language == "*":
            for sup_language in supported_languages:
                if sup_language not in seen:
                    seen.add(sup_language)
                    result.append(sup_language)
        # Generic tag. The "*" guard is redundant after the branch above; kept as a safeguard.
        elif "-" not in language and language != "*":
            for sup_language in supported_languages:
                if sup_language.startswith(language + "-") and sup_language not in seen:
                    seen.add(sup_language)
                    result.append(sup_language)
    return result

Part 4: Quality Factor Weighting (Not Attempted)
Requires parsing q-factors (e.g., en-US;q=0.8) and sorting tags by weight in descending order. Left for the next session.
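One possible shape for the parsing step, sketched here as a starting point since Part 4 was not attempted. The helper name parse_weighted_header is hypothetical, and two behaviours are assumptions rather than requirements from the problem: a tag without a q parameter gets weight 1.0, and a malformed q value is treated as weight 0.0. Ties keep header order because Python's sort is stable.

```python
def parse_weighted_header(accept_header):
    """Return the header's language tags ordered by q-factor, highest first.

    Hypothetical helper: a sketch for Part 4, not the session's solution.
    """
    weighted = []
    for entry in accept_header.split(","):
        # "en-US;q=0.8" -> tag "en-US", params "q=0.8"
        tag, _, params = entry.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue  # skip empty entries, e.g. from an empty header
        weight = 1.0  # assumption: no q parameter means full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # assumption: malformed q values sort last
        weighted.append((tag, weight))
    # list.sort is stable (also with reverse=True), so equal weights keep header order.
    weighted.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in weighted]


print(parse_weighted_header("fr-FR;q=0.5, en-US, de;q=0.8"))
# ['en-US', 'de', 'fr-FR']
```

The ordered tags could then be fed through the Part 3 matching loop in place of the plain comma split.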
Edge Cases
- Whitespace around language tags after splitting on commas
- Duplicate languages across exact and prefix matches (e.g., "fr-FR, fr")
- Wildcard * falling into prefix branch due to no hyphen
- Set iteration order being non-deterministic for prefix matches
- Empty header / no matches returning empty list
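These cases can be turned into quick checks. The sketch below restates the Part 3 logic in condensed form so it runs on its own; the supported list and sample headers are made up for illustration.

```python
def parse_accept_language(accept_header, supported_languages):
    """Condensed restatement of the Part 3 logic so the checks run standalone."""
    result, seen = [], set()
    supported_set = set(supported_languages)
    for language in (tag.strip() for tag in accept_header.split(",")):
        if language in supported_set and language not in seen:
            matches = [language]
        elif language == "*":
            matches = supported_languages
        elif "-" not in language:
            # Iterating the list (not the set) keeps prefix matches in a fixed order.
            matches = [s for s in supported_languages if s.startswith(language + "-")]
        else:
            matches = []
        for match in matches:
            if match not in seen:
                seen.add(match)
                result.append(match)
    return result


supported = ["en-US", "fr-FR", "fr-CA"]  # illustrative data

# Whitespace around tags is trimmed.
assert parse_accept_language("  en-US ,fr-FR ", supported) == ["en-US", "fr-FR"]
# "fr" after "fr-FR" adds only the variant not already present.
assert parse_accept_language("fr-FR, fr", supported) == ["fr-FR", "fr-CA"]
# Wildcard appends everything not yet seen, in supported-list order.
assert parse_accept_language("fr-CA, *", supported) == ["fr-CA", "en-US", "fr-FR"]
# Empty header and headers with no matches both give an empty list.
assert parse_accept_language("", supported) == []
assert parse_accept_language("de, es-ES", supported) == []
```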
Bugs & Issues
- Two-pass ordering bug: Initial Part 2 solution used separate loops for exact and prefix matching, which broke preference order when prefixes appeared before exact matches in the header
- Wildcard condition ordering: * has no hyphen, so elif '-' not in language caught it before the wildcard check. Fixed by either reordering conditions or adding a language != '*' guard
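Both bugs are easy to reproduce in isolation. The original two-pass code is not in these notes, so match_two_pass below is a reconstruction of its likely shape, and all function names are hypothetical. For brevity the sketch checks membership on lists rather than sets.

```python
def match_two_pass(accept_header, supported_languages):
    """Buggy shape: collect all exact matches first, then all prefix matches."""
    tags = [tag.strip() for tag in accept_header.split(",")]
    result = [tag for tag in tags if tag in supported_languages]
    for tag in tags:
        if "-" not in tag:
            for sup in supported_languages:
                if sup.startswith(tag + "-") and sup not in result:
                    result.append(sup)
    return result


def match_single_pass(accept_header, supported_languages):
    """Fixed shape: resolve each header tag fully before moving to the next."""
    result = []
    for tag in (t.strip() for t in accept_header.split(",")):
        if tag in supported_languages:
            candidates = [tag]
        elif "-" not in tag:
            candidates = [s for s in supported_languages if s.startswith(tag + "-")]
        else:
            candidates = []
        for candidate in candidates:
            if candidate not in result:
                result.append(candidate)
    return result


def branch_prefix_first(tag):
    """Buggy elif order: the hyphen test runs first and swallows "*"."""
    if "-" not in tag:
        return "prefix"
    elif tag == "*":
        return "wildcard"  # never reached, because "*" contains no hyphen
    return "other"


def branch_wildcard_first(tag):
    """Fixed elif order: the more specific wildcard check comes first."""
    if tag == "*":
        return "wildcard"
    elif "-" not in tag:
        return "prefix"
    return "other"


supported = ["en-US", "fr-FR", "fr-CA"]  # illustrative data
header = "fr, en-US"  # the generic tag comes first, so French should lead

print(match_two_pass(header, supported))     # ['en-US', 'fr-FR', 'fr-CA']
print(match_single_pass(header, supported))  # ['fr-FR', 'fr-CA', 'en-US']
print(branch_prefix_first("*"), branch_wildcard_first("*"))  # prefix wildcard
```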
Key Learnings
- Single-pass preserves order: Doing exact, prefix, and wildcard matching in one loop over the header naturally maintains preference order
- startswith(prefix + "-") is cleaner than splitting on hyphen and comparing parts
- Separate seen set from result list gives O(1) dedup while preserving insertion order
- set(supported_languages) is more idiomatic than set comprehension {x for x in list}
- Condition ordering matters in elif chains - more specific checks should come first
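A small side-by-side of two of these points, as a sketch with hypothetical helper names. The two prefix checks agree on these inputs, and the trailing hyphen is what stops "en" from matching an unrelated tag such as "eng-US".

```python
def matches_prefix_startswith(tag, prefix):
    # The trailing "-" prevents "en" from matching "eng-US".
    return tag.startswith(prefix + "-")


def matches_prefix_split(tag, prefix):
    # Equivalent check via splitting: needs a region part and an equal first part.
    parts = tag.split("-")
    return len(parts) > 1 and parts[0] == prefix


def dedupe_preserving_order(items):
    """Seen set for O(1) membership, list for insertion order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


for tag in ["en-US", "en", "eng-US", "fr-FR"]:
    assert matches_prefix_startswith(tag, "en") == matches_prefix_split(tag, "en")

print(dedupe_preserving_order(["fr-FR", "en-US", "fr-FR", "de", "en-US"]))
# ['fr-FR', 'en-US', 'de']
```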
Code Quality Notes
- Extracted parse_header as a helper early, anticipating parsing complexity in Part 4
- Variable naming improved from accepted_langauges (typo) to accepted_languages
- Debug print() statements should be removed before moving on
- Could iterate list instead of set in inner loops for deterministic ordering
Q&A Highlights
- Q: Is there a strip/trim method in Python? A: Yes - .strip(), which was already being used. Also .lstrip() and .rstrip() for directional trimming.
- Q: Should parsing be done in one pass or two? A: One pass over header tags, resolving all matches (exact, prefix, wildcard) per tag, preserves preference order naturally.
- Q: Can input have multiple language codes? A: Yes, comma-separated. Individual tags follow language-REGION format.
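For reference, a short illustration of the trimming methods and the tag format from the answers above. The sample strings and variable names are made up.

```python
raw = "  en-US \t"

# strip() trims both ends; lstrip() and rstrip() trim one side only.
assert raw.strip() == "en-US"
assert raw.lstrip() == "en-US \t"
assert raw.rstrip() == "  en-US"

# Comma-separated header with uneven spacing around the tags.
header = "en-US ,  fr-CA,fr-FR  "
tags = [tag.strip() for tag in header.split(",")]
print(tags)  # ['en-US', 'fr-CA', 'fr-FR']

# language-REGION: partition splits at the first hyphen only.
language, _, region = tags[0].partition("-")
print(language, region)  # en US
```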