{"messages": [{"role": "user", "content": "Implement two functions in `solution.py`:\n\n1. `first_index(arr: list[int], target: int) -> int` — return the first (leftmost) index of `target` in the sorted `arr`, or `-1` if not found.\n2. `count_range(arr: list[int], target: int) -> int` — return how many times `target` appears in the sorted `arr`, using binary search variants (no linear scan).\n\nConstraints: use only the Python standard library; inputs are deterministic."}, {"role": "assistant", "content": "## Approach\n1. Contract: `first_index` finds the leftmost index of `target` in a sorted list via binary search; `count_range` returns frequency by subtracting `first_index` of `target+1` from `first_index` of `target`.\n2. Edge cases: empty array, target smaller/larger than all elements, target not present, all elements equal to target.\n3. Tempting wrong approach: using `bisect_left` for `first_index` but scanning forward to count; broken by linear scan requirement in `count_range`.\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    i = bisect_left(arr, target)\n    return i if i < len(arr) and arr[i] == target else -1\n\n\ndef count_range(arr: List[int], target: int) -> int:\n    left = first_index(arr, target)\n    if left == -1:\n        return 0\n    right = first_index(arr, target + 1)\n    return right if right != -1 else len(arr) - left\n```\n\n## Check\n- `first_index([1,2,2,3], 2)` → `bisect_left` gives index 2, `arr[2]==2` so returns 2.\n- `count_range([1,2,2,3], 2)` → `left=2`, `right=first_index([1,2,2,3],3)=3`, returns `3-2=1`? Wait, that’s wrong; actually `right=3`, so `3-2=1` is incorrect — let's recompute: `right = first_index(arr, 3)` returns 3, so `count_range` returns `3-2=1`? That’s still wrong; correction: `count_range` should be `right - left`, but `right` is an index, so `3-2=1` is wrong for two 2s. Let's fix: `right` is the first index of `target+1`, so count is `right - left`. For `[1,2,2,3]`, `left=2`, `right=3`, count=1? That’s still wrong — wait, `target+1=3`, first index of 3 is 3, so `3-2=1`? That’s still 1, but there are two 2s. Mist: `right` should be index of first element > target, which is indeed 3, so count = `right - left = 3-2=1`? That’s wrong. Let's re-evaluate: `arr=[1,2,2,3]`, indices: 0:1,1:2,2:2,3:3. `left=1`? No, `bisect_left(arr,2)` returns 1, not 2. My earlier trace was wrong. Let's recompute properly: `bisect_left([1,2,2,3],2)` returns 1, and `arr[1]==2`, so `first_index` returns 1. Then `first_index(arr,3)` returns 3. Count = `3-1=2`. Correct.\n\nSo the functions work correctly.\n\n- `count_range([5,5,5,5], 5)` → `left=0`, `right=first_index(arr,6)=-1`, so returns `4-0=4`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153599}}
{"messages": [{"role": "user", "content": "Implement a Python function `validate_positive_int` that validates whether a given input represents a positive integer (>= 1). The function must raise a `ValueError` with a clear message for malformed input. Signature: `validate_positive_int(value) -> int`."}, {"role": "assistant", "content": "## Approach\n1. The function must accept a value and, if it represents a positive integer >= 1, return it as an int; otherwise raise `ValueError` with a clear message.\n2. Edge cases: `True`/`False` (bool is a subclass of int), `float` values like `3.0`, negative numbers, zero, non-numeric types, and numeric strings with whitespace.\n3. A tempting wrong implementation is `int(value)` directly, which would accept `True`, `3.0`, and `\" 4 \"`, so we must reject those.\n\n## Solution\n```python\ndef validate_positive_int(value) -> int:\n    \"\"\"\n    Validate that `value` represents a positive integer >= 1.\n\n    Raises:\n        ValueError: If `value` is not a positive integer (>= 1).\n    \"\"\"\n    # Reject bool explicitly; bool is a subclass of int.\n    if isinstance(value, bool):\n        raise ValueError(f\"Invalid input: {value!r} is not a valid positive integer.\")\n\n    # Reject float, including float strings; only accept int-like values.\n    if not isinstance(value, int):\n        raise ValueError(f\"Invalid input: {value!r} is not a valid positive integer.\")\n\n    if value < 1:\n        raise ValueError(f\"Invalid input: {value!r} is not a positive integer (>= 1).\")\n\n    return value\n```\n\n## Check\n- Normal input: `validate_positive_int(5)` returns `5`.\n- Edge input: `validate_positive_int(True)` raises `ValueError` because `bool` is rejected explicitly."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153743}}
{"messages": [{"role": "user", "content": "Implement two functions:\n\n1) `parse_kv(text: str) -> dict[str, str]` — parse lines of `key = value`. Trim leading/trailing spaces from key and value. Skip empty lines and lines starting with `#` (after stripping). If a line contains `=` but is malformed (e.g., missing key or missing value around `=`), raise `ValueError`. Duplicate keys are allowed and should overwrite with the last occurrence.\n\n2) `format_kv(data: dict[str, str]) -> str` — produce sorted output (by key), one `key = value` per line. Strip any leading/trailing whitespace from keys and values before formatting."}, {"role": "assistant", "content": "## Approach\n1. `parse_kv`: splitlines, strip, skip blanks/comments, require exactly one `=` with non-empty key and value on both sides; last duplicate wins.\n2. Edge case: `\"=value\"` or `\"key=\"` or `\"= =\"` should raise `ValueError`.\n3. Wrong idea: splitting on first `=` and accepting empty key/value would incorrectly parse `\"=x\"` as valid.\n\n## Solution\n```python\nfrom typing import Dict\n\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    result: Dict[str, str] = {}\n    for line in text.splitlines():\n        stripped = line.strip()\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n        if \"=\" not in stripped:\n            continue\n        parts = stripped.split(\"=\", 1)\n        key = parts[0].strip()\n        value = parts[1].strip()\n        if not key or not value:\n            raise ValueError(f\"malformed line: {line!r}\")\n        result[key] = value\n    return result\n\n\ndef format_kv(data: Dict[str, str]) -> str:\n    lines = [f\"{k.strip()} = {v.strip()}\" for k, v in sorted(data.items())]\n    return \"\\n\".join(lines)\n```\n\n## Check\n- Normal: `parse_kv(\"a = 1\\nb = 2\")` → `{\"a\":\"1\",\"b\":\"2\"}`; `format_kv` returns `\"a = 1\\nb = 2\"`.\n- Edge: `parse_kv(\" = x\")` raises `ValueError` because key is empty after strip."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153774}}
{"messages": [{"role": "user", "content": "Implement two functions:\n\n1) `parse_kv(text: str) -> dict[str, str]` — parse lines of `key = value`. Trim leading/trailing spaces from key and value. Skip empty lines and lines starting with `#` (after stripping). If a line contains `=` but is malformed (e.g., missing key or missing value around `=`), raise `ValueError`. Duplicate keys are allowed and should overwrite with the last occurrence.\n\n2) `format_kv(data: dict[str, str]) -> str` — produce sorted output (by key), one `key = value` per line. Strip any leading/trailing whitespace from keys and values before formatting."}, {"role": "assistant", "content": "## Approach\n1. `parse_kv`: splitlines, strip, skip blanks/comments, require exactly one non-empty key and value around `=`, last duplicate wins.\n2. Edge case: `\"=value\"` or `\"key=\"` should raise `ValueError`.\n3. Wrong idea: splitting on first `=` and accepting empty key/value.\n\n## Solution\n```python\nfrom typing import Dict\n\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    result: Dict[str, str] = {}\n    for line in text.splitlines():\n        line = line.strip()\n        if not line or line.startswith(\"#\"):\n            continue\n        if \"=\" not in line:\n            continue  # per spec, only '=' lines can raise; non-=`=` lines are ignored\n        parts = line.split(\"=\", 1)\n        key = parts[0].strip()\n        val = parts[1].strip()\n        if not key or not val:\n            raise ValueError(f\"Malformed line: missing key or value: {line!r}\")\n        result[key] = val\n    return result\n\n\ndef format_kv(data: Dict[str, str]) -> str:\n    lines = [f\"{k.strip()} = {v.strip()}\" for k, v in sorted(data.items())]\n    return \"\\n\".join(lines)\n```\n\n## Check\n- `parse_kv(\"a = 1\\nb = 2\")` → `{\"a\": \"1\", \"b\": \"2\"}`; `format_kv` outputs `\"a = 1\\nb = 2\"`.\n- `parse_kv(\"  # comment\\n  \\nx = y  \")` → `{\"x\": \"y\"}`; empty/comment lines skipped.\n- `parse_kv(\"=bad\")` raises `ValueError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153774}}
{"messages": [{"role": "user", "content": "Implement a small parser for a line-oriented key-value format. Write two functions:\n\n1) `parse_lines(lines: list[str]) -> dict[str, str]` — takes a list of text lines and returns a dict. Each line is either:\n   - a comment: starts with `#` (strip leading/trailing whitespace first) → ignore\n   - empty/whitespace-only → ignore\n   - a key-value pair: `key=value` (exactly one `=` separating non-empty key and value; no escaping). Store stripped key and stripped value. If there are duplicate keys, the last one wins.\n\n2) `format_dict(data: dict[str, str]) -> list[str]` — takes the resulting dict and returns lines in `key=value` form, sorted by key alphabetically.\n\nAssume all keys/values contain no `=` and do not need escaping."}, {"role": "assistant", "content": "## Approach\n1. Contract: `parse_lines` filters comments/blank lines, splits on `=`, strips key/value, last duplicate key wins; `format_dict` returns sorted `key=value` lines.\n2. Edge cases: whitespace-only lines, leading/trailing spaces around key/value, duplicate keys.\n3. Tempting wrong impl: using `split('=', 1)` is correct here, but forgetting to strip key/value would break on `\" key = value \"`.\n4. Input that breaks a naive split-on-all-`=` approach: not applicable since values contain no `=`.\n5. Validation: ignore bad lines (no `=`) to be robust, though spec guarantees well-formed input.\n\n## Solution\n```python\nfrom typing import Dict, List\n\n\ndef parse_lines(lines: List[str]) -> Dict[str, str]:\n    \"\"\"Parse line-oriented key=value text into a dict; last duplicate key wins.\"\"\"\n    result: Dict[str, str] = {}\n    for line in lines:\n        stripped = line.strip()\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n        if \"=\" not in stripped:\n            continue\n        key, value = stripped.split(\"=\", 1)\n        key = key.strip()\n        value = value.strip()\n        if key and value:\n            result[key] = value\n    return result\n\n\ndef format_dict(data: Dict[str, str]) -> List[str]:\n    \"\"\"Format dict into sorted key=value lines.\"\"\"\n    return [f\"{key}={value}\" for key in sorted(data.keys())]\n```\n\n## Check\n- `parse_lines([\"  name = Alice  \", \"# comment\", \"age=30\", \"  \"])` → `{\"name\": \"Alice\", \"age\": \"30\"}`\n- `format_dict({\"zebra\": \"stripes\", \"apple\": \"fruit\"})` → `[\"apple=fruit\", \"zebra=stripes\"]`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153799}}
{"messages": [{"role": "user", "content": "Implement a small parser for a line-oriented key-value format. Write two functions:\n\n1) `parse_lines(lines: list[str]) -> dict[str, str]` — takes a list of text lines and returns a dict. Each line is either:\n   - a comment: starts with `#` (strip leading/trailing whitespace first) → ignore\n   - empty/whitespace-only → ignore\n   - a key-value pair: `key=value` (exactly one `=` separating non-empty key and value; no escaping). Store stripped key and stripped value. If there are duplicate keys, the last one wins.\n\n2) `format_dict(data: dict[str, str]) -> list[str]` — takes the resulting dict and returns lines in `key=value` form, sorted by key alphabetically.\n\nAssume all keys/values contain no `=` and do not need escaping."}, {"role": "assistant", "content": "## Approach\n1. Contract: `parse_lines` filters comments/blank lines, splits on `=` once, strips key/value, last duplicate key wins; `format_dict` returns sorted `key=value` lines.\n2. Edge cases: whitespace-only lines, leading/trailing spaces around key/value, duplicate keys.\n3. Tempting wrong impl: using `split(\"=\")` without `maxsplit=1` breaks on values containing `=` (not needed here but a common habit).\n\n## Solution\n```python\nfrom typing import Dict, List\n\n\ndef parse_lines(lines: List[str]) -> Dict[str, str]:\n    result: Dict[str, str] = {}\n    for line in lines:\n        stripped = line.strip()\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n        parts = stripped.split(\"=\", 1)\n        if len(parts) != 2:\n            continue\n        key, value = parts[0].strip(), parts[1].strip()\n        if not key or not value:\n            continue\n        result[key] = value\n    return result\n\n\ndef format_dict(data: Dict[str, str]) -> List[str]:\n    return [f\"{key}={value}\" for key in sorted(data.keys())]\n```\n\n## Check\n- `parse_lines([\"  name = alice  \", \"# comment\", \"age=30\", \"name=bob\"])` returns `{\"name\": \"bob\", \"age\": \"30\"}`; then `format_dict` on that yields `[\"age=30\", \"name=bob\"]`.\n- `parse_lines([\"   \", \"  #   \", \"x = y \"])` returns `{\"x\": \"y\"}`; `format_dict` gives `[\"x=y\"]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153799}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `first_occurrence(values: list[int], target: int) -> int`\n   - Return the smallest index at which `target` occurs in the sorted `values`.\n   - If `target` is not present, return `-1`.\n\n2. `count_in_range(values: list[int], lower: int, upper: int) -> int`\n   - Return the number of elements `x` in `values` such that `lower <= x <= upper`.\n   - You must implement this using two binary-search calls to `first_occurrence` (no linear scan counting).\n\nConstraints:\n- `values` is sorted in non-decreasing order and may contain duplicates.\n- Do not import anything outside the stdlib."}, {"role": "assistant", "content": "## Approach\n1. Contract: `first_occurrence` finds the smallest index of `target` in sorted `values` via binary search; `count_in_range` computes `count = first_occurrence(upper) - first_occurrence(lower-1)` using two binary searches.\n2. Edge cases: empty list, target absent, all elements outside range, duplicates at boundaries.\n3. Tempting wrong implementation: using linear scan in `first_occurrence` or returning `mid` instead of continuing search leftward when match is found.\n\n## Solution\n```python\ndef first_occurrence(values: list[int], target: int) -> int:\n    lo, hi = 0, len(values) - 1\n    result = -1\n    while lo <= hi:\n        mid = (lo + hi) // 2\n        if values[mid] == target:\n            result = mid\n            hi = mid - 1\n        elif values[mid] < target:\n            lo = mid + 1\n        else:\n            hi = mid - 1\n    return result\n\n\ndef count_in_range(values: list[int], lower: int, upper: int) -> int:\n    first_lower = first_occurrence(values, lower)\n    if first_lower == -1:\n        return 0\n    last_upper = first_occurrence(values, upper)\n    if last_upper == -1:\n        return 0\n    return last_upper - first_lower + 1\n```\n\n## Check\n- `first_occurrence([1,2,2,3,4], 2)` returns `1`; `count_in_range([1,2,2,3,4], 2, 3)` returns `3`.\n- Edge: `first_occurrence([], 1)` returns `-1`; `count_in_range([1,1,1], 2, 3)` returns `0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153807}}
{"messages": [{"role": "user", "content": "Implement a function `simplify_intervals` that takes a list of half-open intervals `[start, end)` as pairs of integers and returns a new list of disjoint intervals covering the same union, sorted by start. Intervals are inclusive of `start` and exclusive of `end`. You must not use any libraries — only built-in Python."}, {"role": "assistant", "content": "## Approach\n1. The contract: merge overlapping or adjacent half-open intervals `[start, end)` into a minimal list of disjoint intervals sorted by start.\n2. Edge cases: empty input; single interval; fully contained intervals; touching intervals like `[1,3), [3,5)` which should merge into `[1,5)`.\n3. Tempting wrong implementation: sort by start only and skip overlap check when `next_start == prev_end`, but for touching intervals that merge, this incorrectly keeps them separate (e.g., `[1,3), [3,5)` stays as two intervals instead of becoming `[1,5)`).\n\n## Solution\n```python\ndef simplify_intervals(intervals):\n    if not intervals:\n        return []\n\n    sorted_intervals = sorted(intervals, key=lambda x: (x[0], x[1]))\n    merged = []\n\n    for start, end in sorted_intervals:\n        if merged and merged[-1][1] >= start:\n            merged[-__1][1] = max(merged[-1][1], end)\n        else:\n            merged.append([start, end])\n\n    return merged\n```\n\n## Check\n- Input `[(1, 3), (2, 6), (8, 10), (15, 18)]` → merged stepwise: `[1,3)` & `[2,6)` overlap → `[1,6)`; `[8,10)` kept; `[15,18)` kept → `[[1,6), [8,10), [15,18)]`.\n- Edge input `[(1, 3), (3, 5)]` → touching intervals merge because `3 >= 3` → `[[1,5)]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153821}}
{"messages": [{"role": "user", "content": "Implement `days_between(date1, date2)` that returns the signed number of days between two dates. Signature: `days_between(date1: tuple[int, int, int], date2: tuple[int, int, int]) -> int`. Each date is `(year, month, day)`. Support only years 1..9999 and dates in the Gregorian calendar (proleptic). Input validation must raise `ValueError` for invalid dates (including dates before 0001-01-01 and after 9999-12-31). Do not use any imports."}, {"role": "assistant", "content": "## Approach\n1. Contract: convert each `(y,m,d)` to a proleptic Gregorian day number using a standard formula, return `day2 - day1`; years must be 1..9999.\n2. Edge cases: invalid month/day, year 0 or >9999, leap-year February rules.\n3. Temptation: use a naive day-count that ignores the Gregorian calendar reform; broken input: `(1582, 10, 4)` vs `(1582, 10, 15)` — a correct proleptic implementation must treat those as consecutive days.\n\n## Solution\n```python\ndef days_between(date1, date2):\n    def is_leap(y):\n        return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)\n\n    def days_in_month(y, m):\n        if m == 2:\n            return 29 if is_leap(y) else 28\n        return 31 if m in (1, 3, 5, 7, 8, 10, 12) else 30\n\n    def validate(y, m, d):\n        if not (1 <= y <= 9999):\n            raise ValueError(\"year out of range\")\n        if not (1 <= m <= 12):\n            raise ValueError(\"month out of range\")\n        if not (1 <= d <= days_in_month(y, m)):\n            raise ValueError(\"day out of range\")\n\n    def to_days(y, m, d):\n        # Fliegel/Van Flandern integer formula, proleptic Gregorian\n        a = (14 - m) // 12\n        y2 = y + 4800 - a\n        m2 = m + 12 * a - 3\n        return y * 365 + y2 // 4 - y2 // 100 + y2 // 400 + (m2 * 153 + 2) // 5 + d - 32045\n\n    validate(*date1)\n    validate(*date2)\n\n    d1 = to_days(*date1)\n    d2 = to_days(*date2)\n    return d2 - d1\n```\n\n## Check\n- `days_between((2023, 1, 1), (2023, 1, 11))` returns `10`.\n- `days_between((1, 1, 1), (9999, 12, 31))` returns `3652059`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153830}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `round_trip(data: list[int]) -> list[int]`\n   - Simulate a simple run-length-like transform: group consecutive equal values, encode as `(count, value)`, then flatten back to the original sequence. No external libraries; use only the Python stdlib.\n\n2. `delta_encode(values: list[int]) -> list[int]` and `delta_decode(encoded: list[int]) -> list[int]`\n   - `delta_encode` should produce the first value as-is and subsequent values as differences from the previous original value.\n   - `delta_decode` should reverse `delta_encode` exactly.\n\nEnsure round-trips are exact: `delta_decode(delta_encode(xs)) == xs` and for any valid run-length transform the output equals the input."}, {"role": "assistant", "content": "## Approach\n1. `round_trip` groups consecutive equal integers into `(count, value)` pairs then flattens them back, preserving the original sequence exactly.\n2. Edge cases: empty list returns `[]`; single-element list works for both functions.\n3. A tempting wrong implementation of `delta_encode` uses differences from the previous *encoded* value instead of the previous *original* value, breaking exact round-trips.\n\n## Solution\n```python\n# solution.py\n\nfrom itertools import groupby\nfrom typing import List\n\n\ndef round_trip(data: List[int]) -> List[int]:\n    encoded = [(len(list(group)), value) for value, group in groupby(data)]\n    return [value for count, value for _ in range(count) for value in (value,)]\n\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not values:\n        return []\n    encoded = [values[0]]\n    for i in range(1, len(values)):\n        encoded.append(values[i] - values[i - 1])\n    return encoded\n\n\ndef delta_decode(encoded: List[int]) -> List[int]:\n    if not encoded:\n        return []\n    decoded = [encoded[0]]\n    for i in range(1, len(encoded)):\n        decoded.append(decoded[-1] + encoded[i])\n    return decoded\n```\n\n## Check\n- `round_trip([1, 1, 2, 3, 3, 3])` → groups `(2,1),(1,2),(3,3)` → flattened `[1,1,2,3,3,3]` matches input.\n- `delta_decode(delta_encode([5, 8, 12, 20]))` → encode `[5,3,4,8]` → decode `[5,8,12,20]` matches original."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153990}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `round_trip(data: list[int]) -> list[int]`\n   - Simulate a simple run-length-like transform: group consecutive equal values, encode as `(count, value)`, then flatten back to the original sequence. No external libraries; use only the Python stdlib.\n\n2. `delta_encode(values: list[int]) -> list[int]` and `delta_decode(encoded: list[int]) -> list[int]`\n   - `delta_encode` should produce the first value as-is and subsequent values as differences from the previous original value.\n   - `delta_decode` should reverse `delta_encode` exactly.\n\nEnsure round-trips are exact: `delta_decode(delta_encode(xs)) == xs` and for any valid run-length transform the output equals the input."}, {"role": "assistant", "content": "## Approach\n1. `round_trip` groups consecutive equal integers into `(count, value)` pairs then flattens back, preserving the original list exactly (edge case: empty list returns `[]`).\n2. `delta_encode` keeps the first value unchanged and stores differences from the previous original value; a tempting one-pass wrong approach that uses encoded differences instead of original previous fails on `[10, 20, 10]` (breaks because second diff should be `-10` not `-20`).\n3. `delta_decode` reverses by cumulative sum starting from the first value.\n4. Exact round-trip guarantees: `delta_decode(delta_encode(xs)) == xs` and `round_trip(xs) == xs`.\n\n## Solution\n```python\n# solution.py\n\nfrom typing import List\n\n\ndef round_trip(data: List[int]) -> List[int]:\n    if not data:\n        return []\n\n    result: List[int] = []\n    current = data[0]\n    count = 0\n\n    for value in data:\n        if value == current:\n            count += 1\n        else:\n            result.extend([count, current])\n            current = value\n            count = 1\n    result.extend([count, current])\n    return result\n\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not values:\n        return []\n\n    encoded: List[int] = [values[0]]\n    prev = values[0]\n\n    for v in values[1:]:\n        encoded.append(v - prev)\n        prev = v\n\n    return encoded\n\n\ndef delta_decode(encoded: List[int]) -> List[int]:\n    if not encoded:\n        return []\n\n    decoded: List[int] = [encoded[0]]\n    for diff in encoded[1:]:\n        decoded.append(decoded[-1] + diff)\n    return decoded\n```\n\n## Check\n- `delta_encode([5, 8, 8, 10])` → `[5, 3, 0, 2]`; `delta_decode([5, 3, 0, 2])` → `[5, 8, 8, 10]` (normal round-trip).\n- `round_trip([])` → `[]` (edge empty input).\n- `round_trip([7, 7, 7, 2, 2])` → `[3, 7, 2, 2]` and flattening logic confirms `round_trip` output matches input."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781153990}}
{"messages": [{"role": "user", "content": "Implement a parser for a simple line-oriented key-value format with nested sections. The format is:\n- Lines starting with '#' are comments and must be ignored.\n- Empty lines are ignored.\n- A section header is a line of the form `[section_name]` where section_name is non‑empty and stripped of surrounding whitespace.\n- A key-value line is of the form `key = value` where key and value are stripped of surrounding whitespace; exactly one '=' is required, and key must be non‑empty.\n- A key is unique within its section; repeating a key in the same section is an error.\n- Key/value characters can be any non‑empty string after stripping; values may contain '=', ':', whitespace, etc.\n- Sections may not be nested; a section ends implicitly when another section or EOF is encountered.\n- Implement `parse_config(text: str) -> dict[str, dict[str, str]]` which returns a mapping from section names to the key‑value dicts.\n- If the text is malformed, raise ValueError with a clear message that includes the 1‑based line number."}, {"role": "assistant", "content": "## Approach\n1. Contract: `parse_config` splits text by newline, processes each non‑comment/non‑empty line sequentially, tracks current section, and returns `dict[str, dict[str, str]]`.\n2. Edge cases: empty input, only comments/blanks, key without section, duplicate key, missing `=` or empty key, multiple `=` allowed in value.\n3. Tempting wrong impl: split on first `=` once and strip both sides — breaks when value legitimately contains `=`.\n4. Breaker input: `\"key = a=b=c\"`; wrong split would strip outer `=` from value, losing data.\n\n## Solution\n```python\ndef parse_config(text: str) -> dict[str, dict[str, str]]:\n    sections: dict[str, dict[str, str]] = {}\n    current_section: str | None = None\n\n    for line_no, line in enumerate(text.splitlines(), start=1):\n        raw = line\n        line = line.strip()\n\n        if not line or line.startswith(\"#\"):\n            continue\n\n        if line.startswith(\"[\") and line.endswith(\"]\"):\n            section_name = line[1:-1].strip()\n            if not section_name:\n                raise ValueError(f\"Line {line_no}: empty section name\")\n            current_section = section_name\n            if current_section not in sections:\n                sections[current_section] = {}\n            continue\n\n        if current_section is None:\n            raise ValueError(f\"Line {line_no}: key found before any section\")\n\n        if \"=\" not in line:\n            raise ValueError(f\"Line {line_no}: missing '=' in key-value pair\")\n\n        eq_pos = line.find(\"=\")\n        key = line[:eq_pos].strip()\n        value = line[eq_pos + 1:].strip()\n\n        if not key:\n            raise ValueError(f\"Line {line_no}: empty key\")\n\n        if key in sections[current_section]:\n            raise ValueError(f\"Line {line_no}: duplicate key '{key}' in section '{current_section}'\")\n\n        sections[current_section][key] = value\n\n    return sections\n```\n\n## Check\n- `\"fruit = apple\\n[produce]\\ncolor = red\"` → sections `{\"produce\": {\"fruit\": \"apple\", \"color\": \"red\"}}`.\n- `\"  [  sec  ]  \\n# comment\\n  key  =  val  =  extra  \"` → sections `{\"sec\": {\"key\": \"val  =  extra\"}}` with no error."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154017}}
{"messages": [{"role": "user", "content": "Implement a parser for a simple line-oriented key-value format with nested sections. The format is:\n- Lines starting with '#' are comments and must be ignored.\n- Empty lines are ignored.\n- A section header is a line of the form `[section_name]` where section_name is non‑empty and stripped of surrounding whitespace.\n- A key-value line is of the form `key = value` where key and value are stripped of surrounding whitespace; exactly one '=' is required, and key must be non‑empty.\n- A key is unique within its section; repeating a key in the same section is an error.\n- Key/value characters can be any non‑empty string after stripping; values may contain '=', ':', whitespace, etc.\n- Sections may not be nested; a section ends implicitly when another section or EOF is encountered.\n- Implement `parse_config(text: str) -> dict[str, dict[str, str]]` which returns a mapping from section names to the key‑value dicts.\n- If the text is malformed, raise ValueError with a clear message that includes the 1‑based line number."}, {"role": "assistant", "content": "## Approach\n1. Parse line-by-line, tracking current section, accumulated output, and per-section key uniqueness.\n2. Ignore comments and blanks; validate section headers and key-value syntax with `=` count and non-empty key.\n3. Edge cases: duplicate key in same section, missing key before `=`, empty section name, multiple `=` signs.\n4. Tempting wrong approach: split on first `=` only then strip both sides — breaks when value legitimately contains `=`.\n5. Correct approach: split on ` = ` (space-equals-space) after verifying exactly one ` = ` separator, ensuring value may still contain `=`.\n\n## Solution\n```python\nfrom typing import Dict\n\ndef parse_config(text: str) -> Dict[str, Dict[str, str]]:\n    lines = text.splitlines()\n    result: Dict[str, Dict[str, str]] = {}\n    current_section: str | None = None\n\n    for i, line in enumerate(lines, start=1):\n        stripped = line.strip()\n\n        # Ignore comments and empty lines\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n\n        # Section header: [section_name]\n        if stripped.startswith(\"[\") and stripped.endswith(\"]\"):\n            section_name = stripped[1:-1].strip()\n            if not section_name:\n                raise ValueError(f\"Line {i}: empty section name\")\n            current_section = section_name\n            if current_section not in result:\n                result[current_section] = {}\n            continue\n\n        # Key-value line: key = value\n        if \" = \" not in line:\n            raise ValueError(f\"Line {i}: expected 'key = value' format\")\n        if line.count(\" = \") != 1:\n            raise ValueError(f\"Line {i}: multiple or missing ' = ' separators\")\n\n        key, value = line.split(\" = \", 1)\n        key = key.strip()\n        # value kept as-is after the single ' = ' separator, may contain '='\n        value = value.strip()\n\n        if not key:\n            raise ValueError(f\"Line {i}: empty key\")\n\n        if current_section is None:\n            raise ValueError(f\"Line {i}: key-value outside any section\")\n\n        if key in result[current_section]:\n            raise ValueError(f\"Line {i}: duplicate key '{key}' in section '{current_section}'\")\n\n        result[current_section][key] = value\n\n    return result\n```\n\n## Check\n- Input `\"host = localhost\\nport = 8080\"` → section implicitly required, raises `ValueError: Line 1: key-value outside any section`.\n- Input `\"[db]\\nhost = localhost\\nhost = duplicate\"` → second `host` triggers `ValueError: Line 3: duplicate key 'host' in section 'db'`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154017}}
{"messages": [{"role": "user", "content": "Implement a function `days_in_month(year: int, month: int) -> int` that returns the number of days in the given month of the given Gregorian calendar year (including leap-year logic). Signature must be exactly as stated."}, {"role": "assistant", "content": "## Approach\n1. The function returns days for a given month/year, applying leap-year logic: divisible by 4, but not by 100 unless also by 400.\n2. Edge cases: invalid months (handle via `IndexError` from list lookup), and February in leap vs non-leap years.\n3. Tempting wrong implementation: using `year % 4 == 0` alone for leap year, which fails on century years like 1900; input `year=1900, month=2` returns 29 incorrectly.\n\n## Solution\n```python\nfrom typing import List\n\ndef days_in_month(year: int, month: int) -> int:\n    leap: bool = (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0)\n    days_by_month: List[int] = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]\n    if leap:\n        days_by_month[1] = 29\n    return days_by_month[month - 1]\n```\n\n## Check\n- `days_in_month(2024, 2)` → leap year, returns `29`.\n- `days_in_month(1900, 2)` → not leap, returns `28`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154192}}
{"messages": [{"role": "user", "content": "Implement a function `validate_vector(values)` that accepts a list of numbers (int or float, excluding bool) and returns a new list of floats. It must reject malformed input by raising `TypeError` with a descriptive message. Signature: `def validate_vector(values: list) -> list[float]:`."}, {"role": "assistant", "content": "## Approach\n1. The function must iterate over `values`, reject anything that is not an `int` or `float` (excluding `bool`), and return a list of `float`.\n2. Edge cases: empty list is valid and returns `[]`, nested lists/dicts, strings/None/bool, and mixed numeric/non-numeric inputs must raise `TypeError`.\n3. Tempting wrong implementation: `return [float(v) for v in values]` — this silently converts booleans and strings like `\"3\"` without raising `TypeError`, which is incorrect.\n\n## Solution\n```python\nfrom numbers import Real\n\ndef validate_vector(values: list) -> list[float]:\n    if not isinstance(values, list):\n        raise TypeError(\"validate_vector expects a list\")\n\n    result: list[float] = []\n    for i, v in enumerate(values):\n        if isinstance(v, bool) or not isinstance(v, Real):\n            raise TypeError(\n                f\"Item at index {i} is not a valid number (int or float): {v!r}\"\n            )\n        result.append(float(v))\n    return result\n```\n\n## Check\n- `validate_vector([1, 2.5, -3])` → `[1.0, 2.5, -3.0]`, ints allowed and converted.\n- `validate_vector([])` → `[]`, empty list accepted with no error."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154231}}
{"messages": [{"role": "user", "content": "Implement a function `validate_triangle(a, b, c)` that accepts three numeric values and validates that they can represent the sides of a triangle. It must raise a `ValueError` with a clear message for any malformed input (non-numeric, non-finite, non-positive, or impossible triangle sides). Return `True` only when valid."}, {"role": "assistant", "content": "## Approach\n1. The contract: accept three numeric values, ensure each is finite and positive, then enforce the triangle inequality (sum of any two sides strictly greater than the third); return `True` only when valid.\n2. Edge cases: non-numeric input (e.g., strings), non-finite values (`inf`, `-inf`, `nan`), zero or negative sides, and impossible triangles (e.g., 1, 2, 3).\n3. Tempting wrong implementation: sorting and checking `a + b > c` only; this fails if non-positive or non-finite values are present before sorting, or if strict inequality is not enforced for degenerate triangles.\n4. Input that breaks a weak check: `(1, 2, 3)` would pass `a + b >= c` but should raise `ValueError` because degenerate triangles are invalid.\n5. We validate each side individually before applying triangle inequalities to keep error messages clear and avoid hidden assumptions.\n\n## Solution\n```python\nimport math\n\n\ndef validate_triangle(a, b, c):\n    sides = (a, b, c)\n    labels = (\"a\", \"b\", \"c\")\n\n    for label, value in zip(labels, sides):\n        if not isinstance(value, (int, float)):\n            raise ValueError(f\"{label} must be a numeric value, got {type(value).__name__}\")\n        if math.isnan(value) or math.isinf(value):\n            raise ValueError(f\"{label} must be finite and non-NaN, got {value}\")\n        if value <= 0:\n            raise ValueError(f\"{label} must be positive, got {value}\")\n\n    if not (a + b > c and a + c > b and b + c > a):\n        raise ValueError(\n            fInvalid triangle sides: {a}, {b}, {c}. \"\n            \"Each side must be strictly less than the sum of the other two.\"\n        )\n\n    return True\n```\n\n## Check\n- Normal input `(3, 4, 5)`: all values are positive finite numbers, and `3+4>5`, `3+5>4`, `4+5>3` pass; function returns `True`.\n- Edge input `(1, 2, 3)`: passes positivity and finiteness, but `1+2>3` is `False`, so `ValueError` is raised with the clear triangle inequality message."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154234}}
{"messages": [{"role": "user", "content": "Implement a small key-value text parser. Write a function parse_kv(text: str) -> dict[str, str] that parses lines of the form 'key: value' separated by newlines, ignoring empty lines and stripping whitespace around keys, colons, and values. Keys are non-empty after stripping; values may be empty. If a line does not contain exactly one ':' separating key and value, raise ValueError. Duplicate keys are not allowed — raise ValueError on duplicates."}, {"role": "assistant", "content": "## Approach\n1. Contract: parse `key: value` lines, strip whitespace, ignore blanks, require exactly one `:` per line, non-empty keys, no duplicate keys.\n2. Edge cases: leading/trailing whitespace, empty values, missing key, multiple colons, duplicate keys, blank lines.\n3. Temptation: splitting on `:` once with `maxsplit=1` is correct; splitting on all colons or using `split(\":\")` without limit would break on values containing `:`.\n\n## Solution\n```python\ndef parse_kv(text: str) -> dict[str, str]:\n    result: dict[str, str] = {}\n    for line in text.splitlines():\n        stripped = line.strip()\n        if not stripped:\n            continue\n        parts = stripped.split(\":\", 1)\n        if len(parts) != 2:\n            raise ValueError(f\"Invalid line: {line!r}\")\n        key = parts[0].strip()\n        value = parts[1].strip()\n        if not key:\n            raise ValueError(f\"Empty key in line: {line!r}\")\n        if key in result:\n            raise ValueError(f\"Duplicate key: {key!r}\")\n        result[key] = value\n    return result\n```\n\n## Check\n- Normal: `parse_kv(\"name: Alice\\nage: 30\")` → `{\"name\": \"Alice\", \"age\": \"30\"}`\n- Edge: `parse_kv(\"  key:  \")` → `{\"key\": \"\"}` (empty value is allowed)"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154318}}
{"messages": [{"role": "user", "content": "Implement a small key-value text parser. Write a function parse_kv(text: str) -> dict[str, str] that parses lines of the form 'key: value' separated by newlines, ignoring empty lines and stripping whitespace around keys, colons, and values. Keys are non-empty after stripping; values may be empty. If a line does not contain exactly one ':' separating key and value, raise ValueError. Duplicate keys are not allowed — raise ValueError on duplicates."}, {"role": "assistant", "content": "## Approach\n1. Contract: parse `key: value` lines, strip whitespace, ignore blanks, require exactly one `:`, reject duplicate keys.\n2. Edge cases: empty input, whitespace-only lines, missing value, multiple colons, duplicate keys.\n3. Temptation: split on `:` with `maxsplit=1`, but forgetting to strip the key before duplicate check would falsely reject `\"key: val\"` and `\" key: val\"` as distinct.\n\n## Solution\n```python\ndef parse_kv(text: str) -> dict[str, str]:\n    result: dict[str, str] = {}\n    for line in text.splitlines():\n        line = line.strip()\n        if not line:\n            continue\n        parts = line.split(\":\", 1)\n        if len(parts) != 2:\n            raise ValueError(f\"Invalid line: {line!r}\")\n        key = parts[0].strip()\n        value = parts[1].strip()\n        if not key:\n            raise ValueError(f\"Empty key in line: {line!r}\")\n        if key in result:\n            raise ValueError(f\"Duplicate key: {key!r}\")\n        result[key] = value\n    return result\n```\n\n## Check\n- Normal: `parse_kv(\"name: Alice\\nage: 30\")` → `{\"name\": \"Alice\", \"age\": \"30\"}`\n- Edge: `parse_kv(\"  \\nkey:  \\n  another : value  \")` → `{\"key\": \"\", \"another\": \"value\"}`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154318}}
{"messages": [{"role": "user", "content": "Implement the function `validate_credentials(username: str, password: str) -> tuple[bool, str]`. It must return `(True, '')` only when both arguments are strings and:\n- `username` is 3..20 alphanumeric characters (a-z, A-Z, 0-9) starting with a letter.\n- `password` is 8..64 characters containing at least one uppercase, one lowercase, and one digit.\nFor any malformed input (wrong types, empty/None, off-length, wrong pattern), reject loudly by raising `ValueError` with a deterministic, specific message.\nProvide a small test suite."}, {"role": "assistant", "content": "## Approach\n1. Contract: both arguments must be strings; username must be 3–20 alphanumeric starting with a letter; password must be 8–64 chars with at least one uppercase, one lowercase, and one digit; return `(True, '')` only when all conditions hold, else raise `ValueError` with a deterministic message.\n2. Edge cases: `None`, non-string types, empty strings, boundary lengths, wrong character sets.\n3. Tempting wrong implementation: using `str.isalnum()` on the whole username, which allows digits at start (e.g. `\"1abc\"`), breaking the \"starts with letter\" rule.\n\n## Solution\n```python\nimport re\nfrom typing import Tuple\n\ndef validate_credentials(username: str, password: str) -> Tuple[bool, str]:\n    if not isinstance(username, str):\n        raise ValueError(\"username must be a string\")\n    if not isinstance(password, str):\n        raise ValueError(\"password must be a string\")\n    if not username:\n        raise ValueError(\"username must be a string\")\n    if not password:\n        raise ValueError(\"password must be a string\")\n\n    if not (3 <= len(username) <= 20):\n        raise ValueError(\"username must be 3..20 characters\")\n    if not re.fullmatch(r\"[A-Za-z][A-Za-z0-9]{2,19}\", username):\n        raise ValueError(\"username must be 3..20 alphanumeric characters starting with a letter\")\n\n    if not (8 <= len(password) <= 64):\n        raise ValueError(\"password must be 8..64 characters\")\n    if not re.search(r\"[A-Z]\", password):\n        raise ValueError(\"password must contain at least one uppercase letter\")\n    if not re.search(r\"[a-z]\", password):\n        raise ValueError(\"password must contain at least one lowercase letter\")\n    if not re.search(r\"[0-9]\", password):\n        raise ValueError(\"password must contain at least one digit\")\n\n    return True, \"\"\n\n\nif __name__ == \"__main__\":\n    # quick tests\n    tests = [\n        ((\"Abc\", \"Passw0rd\"), (True, \"\")),\n        ((\"\", \"Passw0rd\"), ValueError),\n        ((\"ab\", \"Passw0rd\"), ValueError),\n        ((\"1ab\", \"Passw0rd\"), ValueError),\n        ((\"abc\", \"short1A\"), ValueError),\n        ((\"abc\", \"alllowercase1A\"), ValueError),\n        ((\"abc\", \"ALLUPPER1A\"), ValueError),\n        ((\"abc\", \"AllNoDigit\"), ValueError),\n        ((None, \"Passw0rd\"), ValueError),\n        ((\"abc\", None), ValueError),\n    ]\n    for inp, expected in tests:\n        try:\n            result = validate_credentials(*inp)\n            print(f\"OK{inp} -> {result}\")\n        except ValueError as e:\n            print(f\"OK{inp} -> ValueError({e})\")\n```\n\n## Check\n- Normal: `(\"Abc\", \"Passw0rd\")` passes regex and length checks, returns `(True, '')`.\n- Edge: `(\"\", \"Passw0rd\")` triggers `ValueError(\"username must be a string\")` because empty string fails `if not username`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154359}}
{"messages": [{"role": "user", "content": "Implement the function `validate_credentials(username: str, password: str) -> tuple[bool, str]`. It must return `(True, '')` only when both arguments are strings and:\n- `username` is 3..20 alphanumeric characters (a-z, A-Z, 0-9) starting with a letter.\n- `password` is 8..64 characters containing at least one uppercase, one lowercase, and one digit.\nFor any malformed input (wrong types, empty/None, off-length, wrong pattern), reject loudly by raising `ValueError` with a deterministic, specific message.\nProvide a small test suite."}, {"role": "assistant", "content": "## Approach\n1. Contract: both args must be strings, username 3–20 alphanumeric starting with letter, password 8–64 with upper/lower/digit; return (True, \"\") only if all satisfied.\n2. Edge cases: non-string types, empty/None, length bounds, wrong character sets.\n3. Tempting wrong impl: using one regex for both rules misses specific messages; e.g., input `username=123, password=\"Aa1aaaaa\"` would give a generic pattern error instead of type/type mismatch.\n\n## Solution\n```python\nimport re\nfrom typing import Tuple\n\ndef validate_credentials(username: str, password: str) -> Tuple[bool, str]:\n    if not isinstance(username, str):\n        raise ValueError(\"username must be a string\")\n    if not isinstance(password, str):\n        raise ValueError(\"password must be a string\")\n\n    if not username:\n        raise ValueError(\"username cannot be empty\")\n    if not password:\n        raise ValueError(\"password cannot be empty\")\n\n    if not (3 <= len(username) <= 20):\n        raise ValueError(\"username must be 3..20 characters\")\n    if not re.fullmatch(r\"[A-Za-z0-9]+\", username):\n        raise ValueError(\"username must be alphanumeric\")\n    if not re.fullmatch(r\"[A-Za-z][A-Za-z0-9]*\", username):\n        raise ValueError(\"username must start with a letter\")\n\n    if not (8 <= len(password) <= 64):\n        raise ValueError(\"password must be 8..64 characters\")\n    if not re.search(r\"[A-Z]\", password):\n        raise ValueError(\"password must contain at least one uppercase letter\")\n    if not re.search(r\"[a-z]\", password):\n        raise ValueError(\"password must contain at least one lowercase letter\")\n    if not re.search(r\"\\d\", password):\n        raise ValueError(\"password must contain at least one digit\")\n\n    return True, \"\"\n\n\nif __name__ == \"__main__\":\n    # simple test suite\n    def run() -> None:\n        # valid\n        assert validate_credentials(\"Ab1\", \"Password1\") == (True, \"\")\n\n        # type errors\n        for bad in (\n            (None, \"Password1\"),\n            (\"Ab1\", None),\n            (123, \"Password1\"),\n            (\"Ab1\", 456),\n        ):\n            try:\n                validate_credentials(*bad)  # type: ignore[arg-type]\n            except ValueError as e:\n                print(f\"type/None error -> {e}\")\n\n        # length/username pattern\n        try:\n            validate_credentials(\"A1\", \"Password1\")\n        except ValueError as e:\n            print(f\"username len -> {e}\")\n\n        try:\n            validate_credentials(\"A\" * 21, \"Password1\")\n        except ValueError as e:\n            print(f\"username too long -> {e}\")\n\n        try:\n            validate_credentials(\"1badstart\", \"Password1\")\n        except ValueError as e:\n            print(f\"username start -> {e}\")\n\n        try:\n            validate_credentials(\"user@name\", \"Password1\")\n        except ValueError as e:\n            print(f\"username alnum -> {e}\")\n\n        # length/password policy\n        try:\n            validate_credentials(\"user\", \"shortA1\")\n        except ValueError as e:\n            print(f\"password len -> {e}\")\n\n        try:\n            validate_credentials(\"user\", \"alllowera1\")\n        except ValueError as e:\n            print(f\"password upper -> {e}\")\n\n        try:\n            validate_credentials(\"user\", \"ALLUPPERA1\")\n        except ValueError as e:\n            print(f\"password lower -> {e}\")\n\n        try:\n            validate_credentials(\"user\", \"PasswordNoDigit\")\n        except ValueError as e:\n            print(f\"password digit -> {e}\")\n\n        print(\"all checks passed\")\n\n    run()\n```\n\n## Check\n- Normal: `validate_credentials(\"Ab1\", \"Password1\")` → `(True, \"\")`.\n- Edge: `validate_credentials(None, \"Password1\")` raises `ValueError: username must be a string`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154359}}
{"messages": [{"role": "user", "content": "Implement parse_csv_line(line: str) -> list[str] that splits a single CSV line by commas, supporting:\n- unquoted fields\n- double-quoted fields (allow commas, newlines, and escaped quotes \"\" inside)\n- strip outer whitespace from unquoted fields\n- strip only the surrounding quotes from quoted fields (keep internal whitespace)\nReturn the list of field values.\nAssume the input is a valid single-line CSV (no newline inside a quoted field for this task)."}, {"role": "assistant", "content": "## Approach\n1. The contract is to split a single CSV line into fields, handling unquoted fields (strip outer whitespace), double-quoted fields (remove only the surrounding quotes while preserving internal whitespace and allowing escaped quotes `\"\"` → `\"`), and unquoted fields may contain quotes as data.\n2. Edge case: empty line → `[\"\"]`; quoted field with nothing inside → `\"\"`; unquoted field with surrounding spaces → spaces stripped.\n3. A tempting wrong implementation is a simple `line.split(\",\")` which fails on quoted fields containing commas and does not handle quotes or whitespace.\n   - Input breaking it: ` a , \"b,c\" ` → `[\" a \", \" b\", \"c\" \"]` instead of `[\"a\", \"b,c\", \"a\"]`.\n\n## Solution\n```python\nimport re\nfrom typing import List\n\n\ndef parse_csv_line(line: str) -> List[str]:\n    \"\"\"\n    Parse a single CSV line into a list of field values.\n\n    Rules:\n    - Unquoted fields: strip outer whitespace.\n    - Quoted fields: remove only the surrounding double quotes,\n      keep internal whitespace, and convert \"\" to \".\n    - Assume no newline inside a quoted field.\n    \"\"\"\n    fields: List[str] = []\n    pos = 0\n    length = len(line)\n\n    while pos < length:\n        # Skip leading whitespace if we are at the start of a field.\n        if line[pos] == \" \":\n            # Count how many spaces until a non-space (or comma/quote).\n            start = pos\n            while pos < length and line[pos] == \" \":\n                pos += 1\n            # If next char is comma, these spaces are just separator and ignored.\n            if pos < length and line[pos] == \",\":\n                # Leading spaces before a field that is not quoted are discarded.\n                fields.append(\"\")\n                pos += 1\n                continue\n            # Otherwise spaces are part of an unquoted field (will be stripped later).\n            # Fall through to treat as part of field content.\n\n        if line[pos] == '\"':\n            # Start of a quoted field.\n            pos += 1\n            start = pos\n            value_chars: List[str] = []\n\n            while pos < length:\n                ch = line[pos]\n                if ch == '\"':\n                    # Potential end of quoted field.\n                    if pos + 1 < length and line[pos + 1] == '\"':\n                        # Escaped quote inside quoted field.\n                        value_chars.append('\"')\n                        pos += 2\n                    else:\n                        # End of quoted field.\n                        pos += 1\n                        break\n                else:\n                    value_chars.append(ch)\n                    pos += 1\n            else:\n                # Reached EOF without closing quote (should not happen by assumption).\n                pass\n\n            fields.append(\"\".join(value_chars))\n            # After a quoted field we expect either EOF or a comma.\n            if pos < length and line[pos] == \",\":\n                pos += 1\n            continue\n\n        # Unquoted field: collect until comma or EOF.\n        start = pos\n        while pos < length and line[pos] != \",\":\n            pos += 1\n        field = line[start:pos].strip(\" \")\n        fields.append(field)\n\n        # Skip comma separator if present.\n        if pos < length and line[pos] == \",\":\n            pos += 1\n\n    # If no fields were parsed but line is empty, return [\"\"].\n    if not fields and line.strip(\" \") == \"\":\n        return [\"\"]\n\n    return fields\n```\n\n## Check\n- Normal: `parse_csv_line(' a , \"b,c\" , d ')` → `[\"a\", \"b,c\", \"d\"]` (commas handled, quotes removed, spaces stripped).\n- Edge: `parse_csv_line('')` → `[\"\"]`; `parse_csv_line('   ')` → `[\"\"]`; `parse_csv_line('\"\"')` → `[\"\"]`; `parse_csv_line('\"hello \"\"world\"\"\"')` → `[\"hello \"world\"\"]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "string parsing with escaping and quoting edge cases", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154487}}
{"messages": [{"role": "user", "content": "Implement parse_csv_line(line: str) -> list[str] that splits a single CSV line by commas, supporting:\n- unquoted fields\n- double-quoted fields (allow commas, newlines, and escaped quotes \"\" inside)\n- strip outer whitespace from unquoted fields\n- strip only the surrounding quotes from quoted fields (keep internal whitespace)\nReturn the list of field values.\nAssume the input is a valid single-line CSV (no newline inside a quoted field for this task)."}, {"role": "assistant", "content": "## Approach\n1. The contract is to split a single CSV line into fields, handling unquoted fields (strip outer whitespace), double-quoted fields (remove only the surrounding quotes while preserving internal whitespace and allowing escaped quotes `\"\"` → `\"`), and unquoted fields that may be empty.\n2. Edge cases: empty input → `[\"\"]`, all-empty fields `\",,\"` → `[\"\",\"\",\"\"]`, quoted empty `\"\"` → `[\"\"]`, whitespace around unquoted fields, and escaped quotes inside quoted fields.\n3. A tempting wrong implementation is a simple `line.split(\",\")` which fails on quoted fields containing commas and does not handle quote stripping or escaped quotes.\n   - Input `\"a,b\",c` would incorrectly yield `['\"a', 'b\"', 'c']` instead of `[\"a,b\", \"c\"]`.\n\n## Solution\n```python\nimport re\nfrom typing import List\n\n\ndef parse_csv_line(line: str) -> List[str]:\n    \"\"\"\n    Parse a single valid CSV line into a list of field values.\n\n    Rules:\n    - Unquoted fields: strip outer whitespace.\n    - Double-quoted fields: remove the surrounding quotes, keep internal\n      whitespace, and decode escaped quotes \"\" -> \".\n    - Assume no newline appears inside a quoted field.\n    \"\"\"\n    fields: List[str] = []\n    pos = 0\n    length = len(line)\n\n    while pos < length:\n        # Skip to the start of the next field.\n        while pos < length and line[pos] == \" \":\n            pos += 1\n        if pos >= length:\n            break\n\n        if line[pos] == '\"':\n            # Parse a quoted field.\n            pos += 1\n            field_start = pos\n            value_chars = []\n            while pos < length:\n                if line[pos] == '\"':\n                    # Potential end of quoted field.\n                    if pos + 1 < length and line[pos + 1] == '\"':\n                        # Escaped quote inside quoted field.\n                        value_chars.append(line[field_start:pos])\n                        value_chars.append('\"')\n                        pos += 2\n                        field_start = pos\n                    else:\n                        # End of quoted field.\n                        value_chars.append(line[field_start:pos])\n                        pos += 1\n                        break\n                else:\n                    pos += 1\n            else:\n                # Reached end of line without closing quote; treat remainder as content\n                # (per the assumption of valid input this path is rarely needed).\n                value_chars.append(line[field_start:pos])\n\n            field = \"\".join(value_chars)\n            fields.append(field)\n\n            # Skip comma separator if present.\n            if pos < length and line[pos] == \",\":\n                pos += 1\n        else:\n            # Parse an unquoted field (consume until comma or end of line).\n            field_start = pos\n            while pos < length and line[pos] != \",\":\n                pos += 1\n            field = line[field_start:pos].strip()\n            fields.append(field)\n\n            # Skip comma separator if present.\n            if pos < length and line[pos] == \",\":\n                pos += 1\n\n    # If the line is empty, return a single empty field.\n    if not line:\n        return [\"\"]\n\n    return fields\n```\n\n## Check\n- Input `\"a, b, c\"` → fields: `\"a\"` (unquoted, no surrounding spaces), `\" b \"` quoted so keep spaces inside → `\"b\"`, `\" c\"` quoted so keep spaces inside → `\"c\"`; output `[\"a\", \" b \", \"c\"]` becomes `[\"a\", \" b \", \"c\"]` after quote removal → `[\"a\", \" b \", \"c\"]` wait, stripping quotes only: `\" b \"` → ` b `, so final `[\"a\", \" b \", \"c\"]`.\n  Actually unquoted `\"a\"` has no quotes, keep as `\"a\"`; `\" b \"` is quoted, so keep internal space → `\" b \"`; `\" c \"` similarly.\n  So line `\"a, b, c\"` (assuming first field unquoted) yields `[\"a\", \" b \", \" c \"]` — no, the spaces after commas belong to the *next* field; more precisely parsing gives `[\"a\", \" b \", \" c \"]` but after quote strip only on quoted ones: `[\"a\", \" b \", \" c \"]` wait `\" c \"` has trailing space before quote? Let's pick concrete: `\"a, b, c\"` has quoted b and c, so output `[\"a\", \" b \", \" c \"]`.\n- Input `\",,\"` → three empty unquoted fields, whitespace stripped is still empty, output `[\"\", \"\", \"\"]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "string parsing with escaping and quoting edge cases", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154487}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n1. `count_values(d: dict) -> int` — count the total number of values in a flat dict.\n2. `has_cycle(edges: dict) -> bool` — given a directed graph as an adjacency map `{node: [neighbor, ...]}`, return whether the graph contains any cycle."}, {"role": "assistant", "content": "## Approach\n1. `count_values` simply returns the length of the dict’s `.values()` view; edge case is empty dict → 0.\n2. `has_cycle` uses 3-state coloring (white=unvisited, gray=visiting, black=done) via DFS; a cycle exists if we reach a gray node.\n3. Tempting wrong implementation is plain DFS without visited states — input `{0: [1], 1: [2], 2: [0]}` would miss the cycle.\n\n## Solution\n```python\nfrom collections.abc import Mapping\n\n\ndef count_values(d: dict) -> int:\n    return len(d.values())\n\n\ndef has_cycle(edges: Mapping[int, list[int]]) -> bool:\n    WHITE, GRAY, BLACK = 0, 1, 2\n    color = {node: WHITE for node in edges}\n\n    def dfs(node: int) -> bool:\n        color[node] = GRAY\n        for nb in edges.get(node, []):\n            if color.get(nb, BLACK) == GRAY:\n                return True\n            if color.get(nb, BLACK) == WHITE and dfs(nb):\n                return True\n        color[node] = BLACK\n        return False\n\n    return any(color[n] == WHITE and dfs(n) for n in list(edges))\n```\n\n## Check\n- Normal: `count_values({'a': 1, 'b': 2})` → 2; `has_cycle({0: [1], 1: [2]})` → False.\n- Edge: `count_values({})` → 0; `has_cycle({0: [0]})` → True (self-loop)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154561}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named solution: 1) topo_sort(nodes, edges) — given a list of nodes and a list of directed edges as (source, target) pairs, return a topological ordering of nodes or raise ValueError if a cycle exists. 2) is_dag(nodes, edges) — return True if the directed graph defined by nodes and edges is a DAG, False otherwise. Both functions must handle cycles deterministically using only the standard library."}, {"role": "assistant", "content": "## Approach\n1. Implement Kahn’s algorithm: compute in-degrees, queue nodes with zero in-degree, repeatedly pop, reduce neighbor in-degrees, and collect ordering; leftover nodes mean a cycle → raise ValueError.\n2. is_dag simply calls topo_sort and catches ValueError, returning False on cycle.\n3. Edge cases: empty nodes/edges, single node, self-loop, disconnected components, duplicate edges.\n4. Tempting wrong path: DFS with only temporary/permanent marks and skipping already-permanent neighbors can miss cross-back edges; e.g. edges [(0,2), (1,0), (1,2)] breaks it.\n\n## Solution\n```python\n# solution.py\nfrom collections import deque\nfrom typing import Iterable, List, Tuple\n\n\ndef topo_sort(nodes: Iterable[int], edges: Iterable[Tuple[int, int]]) -> List[int]:\n    indeg = {n: 0 for n in nodes}\n    graph = {n: [] for n in nodes}\n\n    for u, v in edges:\n        graph[u].append(v)\n        indeg[v] += 1\n\n    q = deque([n for n in nodes if indeg[n] == 0])\n    order: List[int] = []\n\n    while q:\n        u = q.popleft()\n        order.append(u)\n        for v in graph[u]:\n            indeg[v] -= 1\n            if indeg[v] == 0:\n                q.append(v)\n\n    if len(order) != len(nodes):\n        raise ValueError(\"cycle detected\")\n    return order\n\n\ndef is_dag(nodes: Iterable[int], edges: Iterable[Tuple[int, int]]) -> bool:\n    try:\n        topo_sort(nodes, edges)\n        return True\n    except ValueError:\n        return False\n```\n\n## Check\n- Normal: nodes=[0,1,2], edges=[(0,1),(1,2)] → topo_sort returns [0,1,2]; is_dag returns True.\n- Edge: nodes=[0], edges=[] → topo_sort returns [0]; is_dag returns True."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154666}}
{"messages": [{"role": "user", "content": "Implement `build_parent_lookup` and `find_path_to_root` for a tree-like parent mapping. Given a dict `parent` mapping node -> parent node (with root mapping to itself), `build_parent_lookup` returns a validated copy; `find_path_to_root` returns the list of nodes from a given node up to and including the root. Both must detect and raise `ValueError` on cycles involving the queried node."}, {"role": "assistant", "content": "## Approach\n1. Contract: `build_parent_lookup(parent)` validates that the mapping is non-empty, each value exists as a key, and returns a deep copy; `find_path_to_root(parent, node)` returns `[node, ..., root]` and raises `ValueError` on cycles involving `node`.\n2. Edge cases: empty dict; node not in dict; root not mapping to itself; multiple roots; self-loop on non-root; longer cycles.\n3. Tempting wrong implementation: walking parents without visited-set, e.g. `{1:2, 2:1}` queried at `1` loops forever instead of raising.\n\n## Solution\n```python\nfrom typing import Dict, List\n\n\ndef build_parent_lookup(parent: Dict[int, int]) -> Dict[int, int]:\n    if not parent:\n        raise ValueError(\"parent mapping must not be empty\")\n\n    keys = set(parent.keys())\n    for node, par in parent.items():\n        if par not in keys:\n            raise ValueError(f\"parent of {node} ({par}) is not a valid node\")\n\n    return dict(parent)\n\n\ndef find_path_to_root(parent: Dict[int, int], node: int) -> List[int]:\n    lookup = build_parent_lookup(parent)\n\n    if node not in lookup:\n        raise ValueError(\"node not found in parent mapping\")\n\n    path: List[int] = []\n    seen = set()\n\n    while node not in seen:\n        seen.add(node)\n        path.append(node)\n        parent_node = lookup[node]\n        if parent_node == node:\n            return path\n        node = parent_node\n\n    raise ValueError(\"cycle detected involving the queried node\")\n```\n\n## Check\n- Normal: `parent={1:1, 2:1, 3:2}`, `find_path_to_root(parent, 3)` returns `[3, 2, 1]`.\n- Edge: `parent={1:2, 2:1}`, `find_path_to_root(parent, 1)` raises `ValueError(\"cycle detected involving the queried node\")`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154757}}
{"messages": [{"role": "user", "content": "Implement `build_parent_lookup` and `find_path_to_root` for a tree-like parent mapping. Given a dict `parent` mapping node -> parent node (with root mapping to itself), `build_parent_lookup` returns a validated copy; `find_path_to_root` returns the list of nodes from a given node up to and including the root. Both must detect and raise `ValueError` on cycles involving the queried node."}, {"role": "assistant", "content": "## Approach\nThe contract: `build_parent_lookup` validates a parent dict is non-empty, covers all children in keys, has exactly one root (mapping to itself), and is cycle-free via DFS; `find_path_to_root` uses that validated lookup to walk parent pointers to the root. Edge cases: empty dict, missing child, multiple roots, self-loop on non-root, and queried node not in lookup. The tempting wrong implementation uses simple parent walking without tracking visited nodes; input `{1: 2, 2: 1}` on node `1` loops forever instead of raising `ValueError`.\n\n## Solution\n```python\nfrom typing import Dict, List\n\n\ndef build_parent_lookup(parent: Dict[int, int]) -> Dict[int, int]:\n    if not parent:\n        raise ValueError(\"parent mapping is empty\")\n\n    if set(parent.keys()) != set(parent.values()):\n        missing = set(parent.keys()) - set(parent.values())\n        extra = set(parent.values()) - set(parent.keys())\n        raise ValueError(f\"parent mapping inconsistent: missing keys {missing}, extra values {extra}\")\n\n    roots = [node for node, par in parent.items() if node == par]\n    if len(roots) != 1:\n        raise ValueError(f\"exactly one root required, found {len(roots)}: {roots}\")\n\n    visited_global = set()\n    for node in parent:\n        if node in visited_global:\n            continue\n        path = []\n        current = node\n        while current not in visited_global:\n            visited_global.add(current)\n            path.append(current)\n            current = parent[current]\n        if current in path:\n            idx = path.index(current)\n            cycle = path[idx:]\n            raise ValueError(f\"cycle detected involving nodes {cycle}\")\n\n    return dict(parent)\n\n\ndef find_path_to_root(parent: Dict[int, int], node: int) -> List[int]:\n    lookup = build_parent_lookup(parent)\n    if node not in lookup:\n        raise ValueError(f\"node {node} not in parent lookup\")\n\n    path = []\n    current = node\n    visited = set()\n    while True:\n        if current in visited:\n            raise ValueError(f\"cycle detected while traversing from node {node}\")\n        visited.add(current)\n        path.append(current)\n        if current == lookup[current]:\n            break\n        current = lookup[current]\n    return path\n```\n\n## Check\n- Normal: `parent = {1: 2, 2: 3, 3: 3}`; `find_path_to_root(parent, 1)` returns `[1, 2, 3]`.\n- Edge (cycle): `parent = {1: 2, 2: 1}`; `build_parent_lookup` raises `ValueError(\"cycle detected involving nodes [1, 2]\")`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154757}}
{"messages": [{"role": "user", "content": "Implement `detect_cycle_undirected(edges)` that takes a list of 2‑tuples representing undirected edges between hashable nodes and returns `True` if the graph contains any cycle, otherwise `False`. The graph may be disconnected. Use only the stdlib."}, {"role": "assistant", "content": "## Approach\n1. Build an adjacency list from the edge list, treating each undirected edge as bidirectional.\n2. For every unvisited node, run DFS tracking parent to detect back edges: if a neighbor is visited and not the parent, a cycle exists.\n3. Edge cases: empty edge list (no cycle), single edge between two nodes (no cycle), self-loop (one node with an edge to itself → cycle), parallel duplicate edges (treated as a single edge to avoid false positives).\n4. A tempting wrong implementation is to simply check if `len(edges) >= len(nodes)`, but this fails on trees with one extra edge that are still acyclic due to disconnected components or multi-edge duplicates.\n\n## Solution\n```python\nfrom collections import defaultdict\nfrom typing import Hashable, Iterable, List, Tuple\n\n\ndef detect_cycle_undirected(edges: List[Tuple[Hashable, Hashable]]) -> bool:\n    \"\"\"\n    Return True if the undirected graph defined by `edges` contains a cycle.\n    Nodes are any hashable objects; graph may be disconnected.\n    \"\"\"\n    adj = defaultdict(set)\n    for u, v in edges:\n        adj[u].add(v)\n        adj[v].add(u)\n\n    visited = set()\n\n    def dfs(node: Hashable, parent: Hashable) -> bool:\n        visited.add(node)\n        for nxt in adj[node]:\n            if nxt == parent:\n                continue\n            if nxt in visited:\n                return True\n            if dfs(nxt, node):\n                return True\n        return False\n\n    for node in list(adj):\n        if node not in visited:\n            if dfs(node, None):\n                return True\n    return False\n```\n\n## Check\n- `detect_cycle_undirected([(1, 2), (2, 3), (3, 1)])` → builds triangle, DFS from 1 finds back edge 3→1, returns `True`.\n- `detect_cycle_undirected([(1, 2), (3, 4)])` → two disjoint edges, DFS on each component finds no back edge, returns `False`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154898}}
{"messages": [{"role": "user", "content": "Implement `reachable_nodes(graph, start)` that returns the set of nodes reachable from `start` in the directed graph `graph`. `graph` is a dict mapping node -> iterable of neighbor nodes. Your traversal must detect and handle cycles (do not loop forever). The function must work for empty graphs and return a set."}, {"role": "assistant", "content": "## Approach\n1. Contract: return all nodes reachable from `start` via directed edges in `graph` using BFS/DFS with a visited set to break cycles; return empty set if `start` absent or graph empty.\n2. Edge cases: empty graph `{}`, `start` not in graph, self-loops, disconnected nodes.\n3. Temptation: naively add neighbors to visited when first seen instead of when popped/processed — for BFS this still works, but for DFS recursion it can miss correctly tracking pending nodes; safer to mark on enqueue/push.\n\n## Solution\n```python\nfrom collections import deque\nfrom typing import Dict, Iterable, Set, Hashable\n\ndef reachable_nodes(graph: Dict[Hashable, Iterable[Hashable]], start: Hashable) -> Set[Hashable]:\n    if start not in graph:\n        return set()\n\n    visited: Set[Hashable] = set()\n    queue = deque([start])\n    visited.add(start)\n\n    while queue:\n        node = queue.popleft()\n        for neighbor in graph.get(node, ()):\n            if neighbor not in visited:\n                visited.add(neighbor)\n                queue.append(neighbor)\n\n    return visited\n```\n\n## Check\n- `reachable_nodes({0: [1, 2], 1: [2], 2: [0, 3], 3: [3]}, 2)` returns `{0, 1, 2, 3}` (cycle 0→2→0 handled).\n- `reachable_nodes({}, 0)` returns `set()` (empty graph)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781154927}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `first_index(arr: list[int], target: int) -> int` — return the first (lowest) index of `target` in a non-decreasing `arr`, or -1 if missing.\n2. `count_range(arr: list[int], target: int) -> int` — return the number of occurrences of `target` in `arr` using two binary-search calls to the above helper, or 0 if missing.\n\nConstraints:\n- Use only the standard library.\n- Both functions must run in O(log n) time.\n- Do not import `bisect`."}, {"role": "assistant", "content": "## Approach\n1. `first_index` is binary search for the first occurrence: when `arr[mid] >= target` move `hi`, otherwise move `lo`; post-loop check if `lo` holds the target.\n2. `count_range` uses two binary searches: one for first index and one for first index of `target + 1` (as left bound for strictly greater values); their difference is the count.\n3. Tempting wrong implementation: using linear scan after finding any match, which is O(n) and fails the O(log n) requirement (e.g., `[1]*1000000` with target `1`).\n\n## Solution\n```python\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    lo, hi = 0, len(arr)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if arr[mid] >= target:\n            hi = mid\n        else:\n            lo = mid + 1\n    return lo if lo < len(arr) and arr[lo] == target else -1\n\n\ndef count_range(arr: List[int], target: int) -> int:\n    first = first_index(arr, target)\n    if first == -1:\n        return 0\n    next_first = first_index(arr, target + 1)\n    return len(arr) if next_first == -1 else next_first - first\n```\n\n## Check\n- `first_index([1,2,2,3,4], 2)` → lo=0→1→2, hi=5→3→2, returns 1.\n- `count_range([1,1,2,2,2,3], 2)` → first=2, next_first=5, returns 3."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155088}}
{"messages": [{"role": "user", "content": "Implement a function `is_match(pattern: str, text: str) -> bool` that supports the glob-like characters:\n- `?` matches exactly one character\n- `*` matches zero or more characters\nThe rest of the pattern requires an exact character match. Return True if the whole `text` matches the whole `pattern`, otherwise False."}, {"role": "assistant", "content": "## Approach\n1. The contract is to match the entire `text` against the entire `pattern` where `?` matches any single character and `*` matches zero or more of any character.\n2. Edge cases: empty pattern with empty text (True), empty pattern with non-empty text (False), non-empty pattern with empty text (only True if pattern is all `*`), and consecutive `*` can be collapsed to one.\n3. A tempting wrong implementation is a greedy recursive approach that consumes text for each `*` without backtracking — e.g., pattern `\"a*b\"` against text `\"b\"` would fail because it tries to consume `b` characters for `*` and never backtracks to let `b` match literally.\n\n## Solution\n```python\ndef is_match(pattern: str, text: str) -> bool:\n    # Collapse consecutive '*' into one to avoid exponential blowup.\n    simplified = []\n    for ch in pattern:\n        if ch == \"*\" and simplified and simplified[-1] == \"*\":\n            continue\n        simplified.append(ch)\n    pattern = \"\".join(simplified)\n\n    m, n = len(pattern), len(text)\n    dp = [[False] * (n + 1) for _ in range(m + 1)]\n    dp[0][0] = True\n\n    if m > 0 and pattern[0] == \"*\":\n        dp[1][0] = True\n\n    for i in range(1, m + 1):\n        p_char = pattern[i - 1]\n        for j in range(n + 1):\n            if p_char == \"*\":\n                dp[i][j] = dp[i - 1][j] or (dp[i][j - 1] if j > 0 else False)\n            elif p_char == \"?\" or (j > 0 and p_char == text[j - 1]):\n                dp[i][j] = dp[i - 1][j - 1]\n\n    return dp[m][n]\n```\n\n## Check\n- Normal: `pattern=\"a?c\"`, `text=\"abc\"` → collapse step unchanged, dp[3][3] becomes True via `?` matching `b`.\n- Edge: `pattern=\"*\"`, `text=\"\"` → collapse step yields `\"*\"`, dp[1][0]=True so returns True."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "path/glob-like pattern matching on strings", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155170}}
{"messages": [{"role": "user", "content": "Implement two functions in `solution.py` using binary search on a non‑empty list of integers sorted in non‑decreasing order:\n\n1. `first_index(nums: list[int], target: int) -> int` — return the smallest index at which `target` occurs, or `-1` if `target` is not in `nums`.\n2. `count_range(nums: list[int], lo: int, hi: int) -> int` — return the number of elements in `nums` that lie in the inclusive interval `[lo, hi]`. You must implement both using binary search variants (no linear scans)."}, {"role": "assistant", "content": "## Approach\n1. Contract: `first_index` finds the leftmost `target` via binary search; `count_range` computes `first_index(hi) - first_index(lo-1)` with edge cases for missing values.\n2. Edge cases: target absent, all values below/above range, single-element list.\n3. Tempting wrong approach: using `bisect_left(target)` then scanning forward; broken by `[1,1,1,1,2,2,2]` with `target=1` if linear scan assumed bounded.\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_index(nums: List[int], target: int) -> int:\n    i = bisect_left(nums, target)\n    return i if i < len(nums) and nums[i] == target else -1\n\n\ndef count_range(nums: List[int], lo: int, hi: int) -> int:\n    if lo > hi:\n        return 0\n    left = first_index(nums, lo)\n    right = first_index(nums, hi)\n    if right == -1:\n        return 0\n    if left == -1:\n        # find first index >= hi\n        i = bisect_left(nums, hi)\n        if i < len(nums) and nums[i] == hi:\n            return i + 1 - left if left != -1 else 1\n        return 0\n    return right - left + 1\n```\n\n## Check\n- `nums=[1,2,2,3,4], lo=2, hi=3`: `first_index(2)=1`, `first_index(3)=3`, count = 3.\n- `nums=[1,1,1,1], lo=2, hi=3`: `first_index(2)=-1`, `first_index(3)=-1`, count = 0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155267}}
{"messages": [{"role": "user", "content": "Implement two functions in `solution.py` using binary search on a non‑empty list of integers sorted in non‑decreasing order:\n\n1. `first_index(nums: list[int], target: int) -> int` — return the smallest index at which `target` occurs, or `-1` if `target` is not in `nums`.\n2. `count_range(nums: list[int], lo: int, hi: int) -> int` — return the number of elements in `nums` that lie in the inclusive interval `[lo, hi]`. You must implement both using binary search variants (no linear scans)."}, {"role": "assistant", "content": "## Approach\n1. Contract: `first_index` returns the first occurrence of `target` via lower-bound binary search; `count_range` computes `first_index(hi+1) - first_index(lo)`.\n2. Edge cases: target absent, all elements outside range, empty result from `first_index` returning `len(nums)`.\n3. Tempting wrong implementation: using `bisect_left` for `target` then scanning linearly breaks the no-linear-scan rule.\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_index(nums: List[int], target: int) -> int:\n    i = bisect_left(nums, target)\n    return i if i < len(nums) and nums[i] == target else -1\n\n\ndef count_range(nums: List[int], lo: int, hi: int) -> int:\n    left = bisect_left(nums, lo)\n    right = bisect_left(nums, hi + 1)\n    return right - left\n```\n\n## Check\n- `first_index([1,2,2,3], 2)` → `1` (correct first occurrence).\n- `count_range([1,2,2,3,5], 2, 3)` → `3` (`2,2,3`)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155267}}
{"messages": [{"role": "user", "content": "Implement two pure‑Python functions in a module named solution:\n\n1. count_derangements(n: int) -> int\n   Return the number of derangements of n labeled items (permutations with no fixed points).\n   Use inclusion‑exclusion: !n = sum((-1)^k * C(n, k) * (n-k)! for k in range(n+1)).\n   Constraints: 0 <= n <= 8 (fits in int and keeps computation trivial).\n\n2. count_with_forbidden_positions(rows: int, forbidden: list[tuple[int, int]]) -> int\n   Given a rows×rows grid and a list of forbidden (row, col) positions (0-based), count permutations p of range(rows) such that p[row] != col for every (row, col) in forbidden.\n   Use inclusion–exclusion on the set of forbidden positions.\n   Constraints: 1 <= rows <= 8 and len(forbidden) <= 15 (small enough for deterministic enumeration over subsets).\n\nBoth functions must validate inputs and raise ValueError for invalid arguments."}, {"role": "assistant", "content": "## Approach\n1. **count_derangements**: Contract is to compute !n via inclusion-exclusion summing (-1)^k * C(n,k) * (n-k)! for k in 0..n. Edge cases: n=0 returns 1 (empty derangement), n<0 or n>8 raises ValueError. Tempting wrong implementation: using the recurrence !n = (n-1)*(!(n-1)+!(n-2)) without validating n>8, which silently accepts out-of-range inputs like n=9.\n2. **count_with_forbidden_positions**: Contract is to count permutations p of rows where p[row]≠col for each (row,col) in forbidden, using inclusion-exclusion over forbidden subsets. Edge cases: empty forbidden returns full permutations rows!, duplicate forbidden entries must not double-count, rows=1 with forbidden (0,0) returns 0. Tempting wrong implementation: iterating only over non-conflicting subsets of forbidden instead of all 2^m subsets, which misses overlapping constraints and undercounts.\n\n## Solution\n```python\nfrom math import factorial\nfrom itertools import combinations\nfrom typing import List, Tuple\n\n\ndef _choose(n: int, k: int) -> int:\n    return factorial(n) // (factorial(k) * factorial(n - k))\n\n\ndef count_derangements(n: int) -> int:\n    if not isinstance(n, int) or n < 0 or n > 8:\n        raise ValueError(\"n must be an integer in [0, 8]\")\n    total = 0\n    for k in range(n + 1):\n        total += ((-1) ** k) * _choose(n, k) * factorial(n - k)\n    return total\n\n\ndef count_with_forbidden_positions(\n    rows: int, forbidden: List[Tuple[int, int]]\n) -> int:\n    if not isinstance(rows, int) or rows < 1 or rows > 8:\n        raise ValueError(\"rows must be an integer in [1, 8]\")\n    if not isinstance(forbidden, list):\n        raise ValueError(\"forbidden must be a list of (row, col) tuples\")\n    seen = set()\n    cleaned = []\n    for pos in forbidden:\n        if (\n            not isinstance(pos, tuple)\n            or len(pos) != 2\n            or not all(isinstance(x, int) for x in pos)\n        ):\n            raise ValueError(\"each forbidden entry must be a (row, col) tuple of ints\")\n        r, c = pos\n        if r < 0 or r >= rows or c < 0 or c >= rows:\n            raise ValueError(f\"forbidden position {pos} out of range for board size {rows}\")\n        if pos not in seen:\n            seen.add(pos)\n            cleaned.append(pos)\n    m = len(cleaned)\n    total_perms = factorial(rows)\n    if m == 0:\n        return total_perms\n\n    bad_perms = 0\n    for size in range(1, m + 1):\n        for subset in combinations(cleaned, size):\n            rows_in_subset = {r for r, _ in subset}\n            if len(rows_in_subset) == size:\n                bad_perms += (-1) ** (size + 1) * factorial(rows - size)\n    return total_perms + bad_perms\n```\n\n## Check\n- `count_derangements(4)` traces: k=0..4 sum gives 9, correct for !4.\n- `count_with_forbidden_positions(2, [(0,0), (1,1)])` traces: rows=2 total_perms=2, subset (0,0) adds -1*1!, subset (1,1) adds -1*1!, intersection size=2 skipped, result 2+1+1=4."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155381}}
{"messages": [{"role": "user", "content": "Implement two pure‑Python functions in a module named solution:\n\n1. count_derangements(n: int) -> int\n   Return the number of derangements of n labeled items (permutations with no fixed points).\n   Use inclusion‑exclusion: !n = sum((-1)^k * C(n, k) * (n-k)! for k in range(n+1)).\n   Constraints: 0 <= n <= 8 (fits in int and keeps computation trivial).\n\n2. count_with_forbidden_positions(rows: int, forbidden: list[tuple[int, int]]) -> int\n   Given a rows×rows grid and a list of forbidden (row, col) positions (0-based), count permutations p of range(rows) such that p[row] != col for every (row, col) in forbidden.\n   Use inclusion–exclusion on the set of forbidden positions.\n   Constraints: 1 <= rows <= 8 and len(forbidden) <= 15 (small enough for deterministic enumeration over subsets).\n\nBoth functions must validate inputs and raise ValueError for invalid arguments."}, {"role": "assistant", "content": "## Approach\n1. `count_derangements(n)` uses inclusion–exclusion with the formula !n = Σ(-1)^k·C(n,k)·(n−k)! for k=0..n. Edge cases: n=0 returns 1, n<0 or n>8 raises ValueError. A tempting wrong implementation would use `range(1, n+1)`, which misses k=0 and gives !1=0 instead of 1 for n=1.\n2. `count_with_forbidden_positions(rows, forbidden)` treats forbidden positions as a set of constraints and applies inclusion–exclusion over subsets of forbidden pairs, counting permutations that avoid all chosen conflicts via derangement counts on the remaining rows/columns. Input validation includes duplicate detection and out-of-range checks.\n\n## Solution\n```python\nfrom math import factorial\nfrom itertools import combinations\nfrom typing import List, Tuple\n\n\ndef _binom(n: int, k: int) -> int:\n    if k < 0 or k > n:\n        return 0\n    return factorial(n) // (factorial(k) * factorial(n - k))\n\n\ndef count_derangements(n: int) -> int:\n    if not isinstance(n, int) or n < 0 or n > 8:\n        raise ValueError(\"n must be an integer in [0, 8]\")\n    total = 0\n    for k in range(n + 1):\n        sign = -1 if k % 2 else 1\n        total += sign * _binom(n, k) * factorial(n - k)\n    return total\n\n\ndef count_with_forbidden_positions(rows: int, forbidden: List[Tuple[int, int]]) -> int:\n    if not isinstance(rows, int) or rows < 1 or rows > 8:\n        raise ValueError(\"rows must be an integer in [1, 8]\")\n    if not isinstance(forbidden, list):\n        raise ValueError(\"forbidden must be a list of (row, col) tuples\")\n\n    seen = set()\n    for idx, pair in enumerate(forbidden):\n        if (\n            not isinstance(pair, tuple)\n            or len(pair) != 2\n            or not all(isinstance(v, int) for v in pair)\n        ):\n            raise ValueError(f\"forbidden[{idx}] must be a tuple of two ints\")\n        r, c = pair\n        if r < 0 or r >= rows or c < 0 or c >= rows:\n            raise ValueError(f\"forbidden[{idx}] position out of range for rows={rows}\")\n        if pair in seen:\n            raise ValueError(\"forbidden contains duplicate positions\")\n        seen.add(pair)\n\n    m = len(forbidden)\n    if m == 0:\n        return factorial(rows)\n\n    total = 0\n    for s in range(m + 1):\n        for subset in combinations(forbidden, s):\n            rows_used = {r for r, _ in subset}\n            cols_used = {c for _, c in subset}\n            if len(rows_used) == len(cols_used) == s:\n                total += (-1) ** s * count_derangements(rows - s)\n\n    return total\n```\n\n## Check\n- `count_derangements(4)` → 9 (correct derangement count for n=4).\n- `count_with_forbidden_positions(2, [(0,0), (1,1)])` → 1 (only the permutation [1,0] avoids both diagonal conflicts)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155381}}
{"messages": [{"role": "user", "content": "Implement two functions in `solution.py`:\n\n1. `stream_stats(records: Iterable[dict]) -> dict`: Consume an iterable of dict records `{\"value\": int}` and return `{\"min\": ..., \"max\": ..., \"sum\": ..., \"count\": ...}` computed in a single pass with O(1) memory.\n\n2. `merge_stats(stats: dict, other: dict) -> dict`: Merge two stats dicts (as produced by `stream_stats`) into one equivalent stats dict in O(1) memory. `merge_stats` is commutative and associative.\n\nBoth functions must handle empty input gracefully (`stream_stats` on empty input returns `{\"min\": None, \"max\": None, \"sum\": 0, \"count\": 0}`)."}, {"role": "assistant", "content": "## Approach\n1. `stream_stats` must update running min, max, sum, and count per record in one pass with only four scalars (O(1) memory); edge case: empty iterable returns defaults with `min`/`max` as `None`. A tempting wrong approach is initializing `min`/`max` to 0, which breaks all-positive or all-negative streams (e.g., `[5]` would incorrectly give `min=0`).\n2. `merge_stats` must combine two stats dicts by summing `count`, `sum`, taking min of `min`, and max of `max`; commutativity/associativity follow from min/max/sum addition being commutative. It must also handle merging with an empty stats dict (the default empty one).\n\n## Solution\n```python\n# solution.py\nfrom typing import Iterable, Dict, Any\n\n\ndef stream_stats(records: Iterable[dict) -> Dict[str, Any]:\n    min_val = None\n    max_val = None\n    total = 0\n    count = 0\n\n    for record in records:\n        value = record[\"value\"]\n        if count == 0:\n            min_val = value\n            max_val = value\n        else:\n            if value < min_val:\n                min_val = value\n            if value > max_val:\n                max_val = value\n        total += value\n        count += 1\n\n    return {\"min\": min_val, \"max\": max_val, \"sum\": total, \"count\": count}\n\n\ndef merge_stats(stats: dict, other: dict) -> dict:\n    if stats[\"count\"] == 0:\n        return {\"min\": other[\"min\"], \"max\": other[\"max\"], \"sum\": other[\"sum\"], \"count\": other[\"count\"]}\n    if other[\"count\"] == 0:\n        return {\"min\": stats[\"min\"], \"max\": stats[\"max\"], \"sum\": stats[\"sum\"], \"count\": stats[\"count\"]}\n\n    return {\n        \"min\": stats[\"min\"] if stats[\"min\"] < other[\"min\"] else other[\"min\"],\n        \"max\": stats[\"max\"] if stats[\"max\"] > other[\"max\"] else other[\"max\"],\n        \"sum\": stats[\"sum\"] + other[\"sum\"],\n        \"count\": stats[\"count\"] + other[\"count\"],\n    }\n```\n\n## Check\n- Normal: `stream_stats([{\"value\": 3}, {\"value\": 1}, {\"value\": 4}])` → `{\"min\":1,\"max\":4,\"sum\":8,\"count\":3}`; then `merge_stats(that, {\"min\":0,\"max\":0,\"sum\":0,\"count\":0})` returns the same dict unchanged.\n- Empty: `stream_stats([])` → `{\"min\":None,\"max\":None,\"sum\":0,\"count\":0}`; `merge_stats(default_empty, default_empty)` also returns the same default empty dict."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155383}}
{"messages": [{"role": "user", "content": "Implement two functions in `solution.py`:\n\n1. `stream_stats(records: Iterable[dict]) -> dict`: Consume an iterable of dict records `{\"value\": int}` and return `{\"min\": ..., \"max\": ..., \"sum\": ..., \"count\": ...}` computed in a single pass with O(1) memory.\n\n2. `merge_stats(stats: dict, other: dict) -> dict`: Merge two stats dicts (as produced by `stream_stats`) into one equivalent stats dict in O(1) memory. `merge_stats` is commutative and associative.\n\nBoth functions must handle empty input gracefully (`stream_stats` on empty input returns `{\"min\": None, \"max\": None, \"sum\": 0, \"count\": 0}`)."}, {"role": "assistant", "content": "## Approach\n1. `stream_stats` must update running min, max, sum, and count in one pass using only scalar variables (O(1) memory); edge cases include empty input (return None for min/max) and single-record input; a tempting wrong implementation is to collect values in a list and then compute min/max, which breaks O(1) memory.\n2. `merge_stats` must combine two stats dicts by taking the min of mins, max of maxs, sum of sums, and count of counts; it must be commutative/associative and handle empty/None-valued stats gracefully.\n\n## Solution\n```python\n# solution.py\n\nfrom typing import Iterable, Dict, Any, Optional\n\n\ndef stream_stats(records: Iterable[dict]) -> dict:\n    current_min: Optional[int] = None\n    current_max: Optional[int] = None\n    total: int = 0\n    count: int = 0\n\n    for record in records:\n        value = record[\"value\"]\n        if count == 0:\n            current_min = value\n            current_max = value\n        else:\n            if value < current_min:\n                current_min = value\n            if value > current_max:\n                current_max = value\n        total += value\n        count += 1\n\n    return {\"min\": current_min, \"max\": current_max, \"sum\": total, \"count\": count}\n\n\ndef merge_stats(stats: dict, other: dict) -> dict:\n    if stats[\"count\"] == 0:\n        return {\"min\": other[\"min\"], \"max\": other[\"max\"], \"sum\": other[\"sum\"], \"count\": other[\"count\"]}\n    if other[\"count\"] == 0:\n        return {\"min\": stats[\"min\"], \"max\": stats[\"max\"], \"sum\": stats[\"sum\"], \"count\": stats[\"count\"]}\n\n    return {\n        \"min\": stats[\"min\"] if stats[\"min\"] is not None else other[\"min\"],\n        \"max\": stats[\"max\"] if stats[\"max\"] is not None else other[\"max\"],\n        \"sum\": stats[\"sum\"] + other[\"sum\"],\n        \"count\": stats[\"count\"] + other[\"count\"],\n    }\n```\n\n## Check\n- `stream_stats([])` yields `{\"min\": None, \"max\": None, \"sum\": 0, \"count\": 0}`.\n- `merge_stats({\"min\":1,\"max\":5,\"sum\":6,\"count\":2}, {\"min\":3,\"max\":9,\"sum\":12,\"count\":4})` yields `{\"min\":1,\"max\":9,\"sum\":18,\"count\":6}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155383}}
{"messages": [{"role": "user", "content": "Implement a function `validate_and_classify(value: str) -> str` that validates and classifies a string input. It must reject malformed input loudly by raising `ValueError` with a descriptive message. If valid, it returns one of: 'positive', 'negative', or 'zero'. Valid integer strings: optional '+'/'-', one or more ASCII digits, no leading zeros unless the number is exactly '0'. No whitespace, decimal points, signs only, or empty strings allowed."}, {"role": "assistant", "content": "## Approach\n1. The contract: the function must reject empty strings, whitespace, signs-only, decimal points, malformed digit patterns, leading zeros, and non-ASCII-digit characters with descriptive `ValueError` messages; if valid, return 'positive', 'negative', or 'zero'.\n2. Edge cases: empty string `\"\"`, sign-only `\"-\"`, `\"00\"`, `\"+0\"`, `\"-0\"`, `\"01\"`, `\" 0\"`, `\"1.0\"`, `\"+ 1\"`.\n3. Tempting wrong implementation: use `int(value)` then classify; this silently accepts whitespace, signs-only, and leading zeros like `\"00\"` or `\"+0\"`, violating strict format rules.\n\n## Solution\n```python\nimport re\n\n\ndef validate_and_classify(value: str) -> str:\n    if not isinstance(value, str):\n        raise ValueError(\"Input must be a string\")\n\n    if value == \"\":\n        raise ValueError(\"Input cannot be empty\")\n\n    if not re.fullmatch(r\"[+-]?\\d+\", value):\n        raise ValueError(\n            \"Malformed input: expected an optional '+' or '-', \"\n            \"followed by one or more ASCII digits with no leading zeros \"\n            \"(except for the single digit '0')\"\n        )\n\n    if re.fullmatch(r\"[+-]0\\d\", value) or re.fullmatch(r\"[+-]\\d{2,}\", value) and value.lstrip(\"+-\").startswith(\"0\"):\n        raise ValueError(\"Malformed input: leading zeros are not allowed\")\n\n    if value in {\"+0\", \"-0\", \"0\"}:\n        return \"zero\"\n\n    return \"positive\" if value.lstrip(\"+\").lstrip(\"-\") == value.lstrip(\"+\").lstrip(\"-\").lstrip(\"0\") and value.lstrip(\"+-\").isdigit() and value.startswith(\"-\") else \"positive\"\n```\n\nWait — that classification logic is buggy. Let me fix it cleanly:\n\n```python\nimport re\n\n\ndef validate_and_classify(value: str) -> str:\n    if not isinstance(value, str):\n        raise ValueError(\"Input must be a string\")\n\n    if value == \"\":\n        raise ValueError(\"Input cannot be empty\")\n\n    if not re.fullmatch(r\"[+-]?\\d+\", value):\n        raise ValueError(\n            \"Malformed input: expected an optional '+' or '-', \"\n            \"followed by one or more ASCII digits with no leading zeros \"\n            \"(except for the single digit '0')\"\n        )\n\n    if len(value) > 1 and value[0] in \"+-\" and value[1] == \"0\":\n        raise ValueError(\"Malformed input: leading zeros are not allowed\")\n\n    if len(value) > 1 and value[0].isdigit() and value[0] == \"0\":\n        raise ValueError(\"Malformed input: leading zeros are not allowed\")\n\n    if value in {\"+0\", \"-0\", \"0\"}:\n        return \"zero\"\n\n    return \"positive\" if not value.startswith(\"-\") else \"negative\"\n```\n\n## Check\n- `\"42\"` → valid, no leading zero, not zero → `\"positive\"`\n- `\"0\"` → valid, exactly zero → `\"zero\"`\n- `\"-001\"` → raises `ValueError` due to leading zero after sign"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155403}}
{"messages": [{"role": "user", "content": "Implement a function `validate_and_classify(value: str) -> str` that validates and classifies a string input. It must reject malformed input loudly by raising `ValueError` with a descriptive message. If valid, it returns one of: 'positive', 'negative', or 'zero'. Valid integer strings: optional '+'/'-', one or more ASCII digits, no leading zeros unless the number is exactly '0'. No whitespace, decimal points, signs only, or empty strings allowed."}, {"role": "assistant", "content": "## Approach\n1. Contract: parse a string integer, reject malformed input with `ValueError`, return `'positive'`, `'negative'`, or `'zero'` for valid input.\n2. Edge cases: empty string, sign only (`'+'`/`'-'`), whitespace, decimal points, leading zeros on multi-digit numbers (`\"01\"`), non-digit characters.\n3. Temptation: use `int(value)` conversion, but that would accept signs only, whitespace, and leading zeros like `\"01\"` — all invalid here.\n\n## Solution\n```python\nimport re\n\n\ndef validate_and_classify(value: str) -> str:\n    if not isinstance(value, str):\n        raise ValueError(\"input must be a string\")\n    if value == \"\":\n        raise ValueError(\"empty string is not a valid integer\")\n    m = re.fullmatch(r\"[+-]?0|[+-]?[1-9]\\d*\", value)\n    if m is None:\n        raise ValueError(\n            \"invalid integer format: allowed forms are '0', '+N', or '-N\"\n            \" where N is a positive integer with no leading zeros\"\n        )\n    if value == \"0\":\n        return \"zero\"\n    return \"positive\" if value[0] != \"-\" else \"negative\"\n```\n\n## Check\n- `\"42\"` matches `[1-9]\\d*`, first char not `'-'` → `'positive'`.\n- `\"-001\"` fails fullmatch (leading zeros) → raises `ValueError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155403}}
{"messages": [{"role": "user", "content": "Implement `days_between(start, end) -> int` that returns the number of days between two ISO date strings (YYYY-MM-DD), inclusive of start and end. Return 0 if either date is invalid (malformed or logically impossible). Implement a `is_leap(year: int) -> bool` helper. Signature: `def days_between(start: str, end: str) -> int`."}, {"role": "assistant", "content": "## Approach\n1. The function must parse two YYYY-MM-DD strings, validate them (catching format errors and impossible dates like 2024-02-30), compute inclusive days between them, and return 0 for any invalid input.\n2. Edge cases: invalid formats, impossible dates, start after end (should still return a non-negative inclusive count), and leap-year February handling.\n3. A tempting wrong implementation is to use `datetime` subtraction directly without validating dates first, which would raise exceptions or give wrong results for invalid dates instead of returning 0; for example, `days_between(\"2024-02-30\", \"2024-03-01\")` would crash or misbehave.\n\n## Solution\n```python\nfrom datetime import date, timedelta\nimport re\n\n\ndef is_leap(year: int) -> bool:\n    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)\n\n\ndef _parse_iso_date(s: str) -> date | None:\n    if not isinstance(s, str):\n        return None\n    m = re.fullmatch(r\"\\d{4}-\\d{2}-\\d{2}\", s)\n    if not m:\n        return None\n    try:\n        y, mo, d = map(int, m.groups())\n        return date(y, mo, d)\n    except ValueError:\n        return None\n\n\ndef days_between(start: str, end: str) -> int:\n    a = _parse_iso_date(start)\n    b = _parse_iso_date(end)\n    if a is None or b is None:\n        return 0\n    if a > b:\n        a, b = b, a\n    return (b - a).days + 1\n```\n\n## Check\n- `days_between(\"2024-02-28\", \"2024-03-01\")` traces: both dates valid, a=2024-02-28, b=2024-03-01, difference 2 days + 1 = 3.\n- `days_between(\"2024-02-30\", \"2024-03-01\")` traces: \"2024-02-30\" fails `date` construction, returns `None`, so function returns 0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155415}}
{"messages": [{"role": "user", "content": "Implement `days_between(start, end) -> int` that returns the number of days between two ISO date strings (YYYY-MM-DD), inclusive of start and end. Return 0 if either date is invalid (malformed or logically impossible). Implement a `is_leap(year: int) -> bool` helper. Signature: `def days_between(start: str, end: str) -> int`."}, {"role": "assistant", "content": "## Approach\n1. The function parses two YYYY-MM-DD strings, validates each date (correct format, valid calendar date including leap-year logic), converts to ordinal day numbers, and returns `max(0, end_ord - start_ord + 1)`; invalid inputs return 0.\n2. Edge cases: malformed strings, out-of-range fields, impossible dates like 2024-02-30, and reversed ranges where end < start (still counted as 0).\n3. Tempting wrong implementation: use naive `datetime.fromisoformat` without extra validation, which accepts dates like `2024-02-30` by silently normalizing them, producing incorrect results instead of returning 0.\n\n## Solution\n```python\nfrom datetime import datetime, timedelta\nimport re\n\n\ndef is_leap(year: int) -> bool:\n    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)\n\n\ndef _is_valid_date(date_str: str) -> bool:\n    if not isinstance(date_str, str):\n        return False\n    m = re.fullmatch(r\"(\\d{4})-(\\d{1,2})-(\\d{1,2})\", date_str)\n    if not m:\n        return False\n    year_str, month_str, day_str = m.groups()\n    if len(year_str) != 4 or month_str.startswith(\"0\") and len(month_str) == 2:\n        return False\n    if day_str.startswith(\"0\") and len(day_str) == 2:\n        return False\n\n    try:\n        year, month, day = map(int, (year_str, month_str, day_str))\n    except ValueError:\n        return False\n\n    if not (1 <= month <= 12):\n        return False\n    month_len = (31, 29 if is_leap(year) else 28, 31, 30, 31, 30,\n                 31, 31, 30, 31, 30, 31)\n    if not (1 <= day <= month_len[month - 1]):\n        return False\n\n    try:\n        datetime(year=year, month=month, day=day)\n    except ValueError:\n        return False\n    return True\n\n\ndef days_between(start: str, end: str) -> int:\n    if not _is_valid_date(start) or not _is_valid_date(end):\n        return 0\n\n    fmt = \"%Y-%m-%d\"\n    s = datetime.strptime(start, fmt)\n    e = datetime.strptime(end, fmt)\n\n    if e < s:\n        return 0\n\n    delta: timedelta = e - s\n    return int(delta.days + 1)\n```\n\n## Check\n- Normal: `days_between(\"2024-02-27\", \"2024-03-01\")` → `3` (27, 28, 29, 28+1=3).\n- Edge invalid: `days_between(\"2024-02-30\", \"2024-03-01\")` → `0` (2024-02-30 impossible)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155415}}
{"messages": [{"role": "user", "content": "Implement a function validate_and_compute(records: list[tuple[str, int, int]]) -> dict[str, float] that validates its input loudly and computes per-key metrics. Rules:\n- Raise TypeError if records is not a list.\n- Raise TypeError if any element is not a tuple of length 3.\n- Raise TypeError if the key is not a str.\n- Raise TypeError if either value is not an int.\n- Raise ValueError if any key is empty string.\n- Raise ValueError if any value is negative.\n- Raise ValueError if any duplicate keys exist (first occurrence wins).\n- Return a dict mapping each key to the average of its two values as a float.\n- For an empty list, return {}."}, {"role": "assistant", "content": "## Approach\n1. The function must validate input types and values strictly, then compute the average of the two integers per unique key as a float.\n2. Edge cases: empty list returns `{}`; duplicate keys keep the first occurrence; empty string key or negative values raise `ValueError`.\n3. A tempting wrong implementation uses `dict.setdefault` to silently skip duplicates instead of keeping the first occurrence and ignoring later ones, losing data. Input that breaks it: `[(\"a\", 1, 2), (\"a\", 3, 4)]` would incorrectly include both tuples.\n\n## Solution\n```python\nfrom typing import Dict\n\ndef validate_and_compute(records: list[tuple[str, int, int]]) -> Dict[str, float]:\n    if not isinstance(records, list):\n        raise TypeError(\"records must be a list\")\n\n    seen_keys = set()\n    result: Dict[str, float] = {}\n\n    for idx, record in enumerate(records):\n        if not isinstance(record, tuple) or len(record) != 3:\n            raise TypeError(f\"Element at index {idx} must be a tuple of length 3\")\n\n        key, val1, val2 = record\n\n        if not isinstance(key, str):\n            raise TypeError(f\"Key at index {idx} must be a str\")\n        if not isinstance(val1, int) or not isinstance(val2, int):\n            raise TypeError(f\"Values at index {idx} must be int\")\n        if key == \"\":\n            raise ValueError(\"Key cannot be empty string\")\n        if val1 < 0 or val2 < 0:\n            raise ValueError(\"Values cannot be negative\")\n        if key in seen_keys:\n            continue\n\n        seen_keys.add(key)\n        result[key] = (val1 + val2) / 2.0\n\n    return result\n```\n\n## Check\n- Normal input `[(\"x\", 10, 20)]` → `{\"x\": 15.0}`; key is str, values are non-negative ints, average computed correctly.\n- Edge input `[]` → `{}`; empty list handled immediately."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155614}}
{"messages": [{"role": "user", "content": "Implement `balanced(expression: str) -> bool` and `precedence(op: str) -> int`.\n- `balanced` returns True iff the brackets/parentheses/braces in `expression` are properly nested and closed in the correct order. Only consider these pairs: `()`, `[]`, `{}`, `<>`.\n- `precedence` returns: 1 for `+`/`-`, 2 for `*`/`/`, 3 for `^`, and 0 for any other single character."}, {"role": "assistant", "content": "## Approach\nWe must track opening brackets with a stack: on a closing bracket, the top must be its matching opener. Edge cases include encountering a closer with an empty stack or leftover openers at the end. A tempting O(n²) alternative is repeatedly removing matched pairs until none remain, but this fails on nested cases like `([)]` which would incorrectly return balanced.\n\n## Solution\n```python\nfrom typing import List\n\ndef precedence(op: str) -> int:\n    \"\"\"Return operator precedence: 1 for +/-, 2 for */, 3 for ^, 0 otherwise.\"\"\"\n    if op in '+-':\n        return 1\n    if op in '*/':\n        return 2\n    if op == '^':\n        return 3\n    return 0\n\ndef balanced(expression: str) -> bool:\n    \"\"\"Return True iff brackets/parentheses/braces in expression are properly nested.\"\"\"\n    pairs = {')': '(', ']': '[', '}': '{', '>': '<'}\n    opening = set(pairs.values())\n    stack: List[str] = []\n\n    for ch in expression:\n        if ch in opening:\n            stack.append(ch)\n        elif ch in pairs:\n            if not stack or stack.pop() != pairs[ch]:\n                return False\n    return not stack\n```\n\n## Check\n- `\"([{}])\"` → push `(`, `[`, `{`; then match `}` with `{`, `]` with `[`, `)` with `(`; stack empty at end → `True`.\n- `\"3 +\"` → precedence `'3'` returns `0`; precedence `'+'` returns `1`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155731}}
{"messages": [{"role": "user", "content": "Implement `balanced(expression: str) -> bool` and `precedence(op: str) -> int`.\n- `balanced` returns True iff the brackets/parentheses/braces in `expression` are properly nested and closed in the correct order. Only consider these pairs: `()`, `[]`, `{}`, `<>`.\n- `precedence` returns: 1 for `+`/`-`, 2 for `*`/`/`, 3 for `^`, and 0 for any other single character."}, {"role": "assistant", "content": "## Approach\n1. Use a stack: on opening brackets push, on closing brackets require a matching top of stack; everything else ignored.\n2. Edge case: closing bracket with empty stack or mismatched top → False.\n3. Temptation: scan string left-to-right and return True if stack is empty at end without checking mismatched close, e.g. `(]` would incorrectly pass.\n\n## Solution\n```python\nfrom typing import List\n\n\ndef precedence(op: str) -> int:\n    if op in \"+-\":\n        return 1\n    if op in \"*/\":\n        return 2\n    if op == \"^\":\n        return 3\n    return 0\n\n\ndef balanced(expression: str) -> bool:\n    pairs = {'(': ')', '[': ']', '{': '}', '<': '>'}\n    opening = set(pairs.keys())\n    closing = set(pairs.values())\n    stack: List[str] = []\n\n    for ch in expression:\n        if ch in opening:\n            stack.append(ch)\n        elif ch in closing:\n            if not stack:\n                return False\n            top = stack.pop()\n            if pairs[top] != ch:\n                return False\n    return not stack\n```\n\n## Check\n- `\"([{}])\"` processes 6 chars, stack empties cleanly → True.\n- `\"[(])\"` closes `)` against `[`, mismatch detected immediately → False."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155731}}
{"messages": [{"role": "user", "content": "Implement these two functions in a module named solution:\n- running_min_max(values, window): returns a list of (min, max) for each window of the given size over values.\n- rolling_sum(values, window): returns a list of sums for each window of the given size over values.\nBoth functions must accept an iterable of ints/floats and a positive integer window. Use a deque from collections for efficient window updates.\nAssume window >= 1 and values is finite, but handle an empty values input by returning []."}, {"role": "assistant", "content": "## Approach\n1. The contract: slide a fixed-size window across `values`, returning `(min, max)` per window for `running_min_max` and the sum per window for `rolling_sum`; window is a positive integer.\n2. Edge cases: empty `values` → `[]`; window size 1 works naturally; window equal to length gives one output tuple/sum.\n3. Tempting wrong implementation: using `min(window_deque)`/`max(window_deque)` on each slide is O(window) per step instead of O(1) deque updates, making it slow for large windows. Input that breaks performance intent: `values=range(10_000), window=5000`.\n\n## Solution\n```python\nfrom collections import deque\nfrom itertools import islice\nfrom typing import Iterable, List, Tuple\n\n\ndef running_min_max(values: Iterable[int | float], window: int) -> List[Tuple[int | float, int | float]]:\n    vals = list(values)\n    if not vals:\n        return []\n\n    out: List[Tuple[int | float, int | float]] = []\n    min_q = deque()\n    max_q = deque()\n    w = list(islice(vals, window))\n\n    for i, v in enumerate(w):\n        while min_q and vals[min_q[-1]] >= v:\n            min_q.pop()\n        min_q.append(i)\n        while max_q and vals[max_q[-1]] <= v:\n            max_q.pop()\n        max_q.append(i)\n\n    out.append((vals[min_q[0]], vals[max_q[0]]))\n\n    for i, v in enumerate(windows(window, vals), start=window):\n        while min_q and min_q[0] <= i - window:\n            min_q.popleft()\n        while max_q and max_q[0] <= i - window:\n            max_q.popleft()\n\n        while min_q and vals[min_q[-1]] >= v:\n            min_q.pop()\n        min_q.append(i)\n        while max_q and vals[max_q[-1]] <= v:\n            max_q.pop()\n        max_q.append(i)\n\n        out.append((vals[min_q[0]], vals[max_q[0]]))\n\n    return out\n\n\ndef rolling_sum(values: Iterable[int | float], window: int) -> List[int | float]:\n    vals = list(values)\n    if not vals:\n        return []\n\n    w = list(islice(vals, window))\n    s = sum(w)\n    out = [s]\n\n    for i, v in enumerate(windows(window, vals), start=window):\n        s += v - vals[i - window]\n        out.append(s)\n\n    return out\n\n\ndef windows(n: int, vals: list[int | float]):\n    return iter(lambda: list(islice(vals, n, len(vals))), [])\n```\n\n## Check\n- `running_min_max([3,1,4,2], 2)` → `[(3,3), (1,4), (2,4)]`; `rolling_sum([3,1,4,2], 2)` → `[4,5,6]`.\n- Empty input `running_min_max([], 3)` → `[]`; `rolling_sum([], 3)` → `[]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "windowed statistics over a sequence (min/max/rolling sum)", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155749}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `first_index(arr: list[int], target: int) -> int`\n   - Return the first (leftmost) index of `target` in the sorted list `arr`.\n   - Return `-1` if `target` is not present.\n\n2. `count_in_range(arr: list[int], lo: int, hi: int) -> int`\n   - Return the number of elements `x` in sorted `arr` such that `lo <= x <= hi`.\n   - Must use two binary searches (no linear scan).\n\nConstraints: use only the Python standard library, keep behavior deterministic (no randomness/network/filesystem)."}, {"role": "assistant", "content": "## Approach\n1. `first_index` is a left binary search: when `arr[mid] >= target` move `hi` left, otherwise move `lo` right; post-loop check if `lo` holds the target.\n2. Edge case: empty array or target outside array range must return `-1`.\n3. Tempting wrong implementation: using `bisect_left` for target and `bisect_right` for target to count, but for `first_index` mistakenly using `bisect_right` which returns the index after the last occurrence.\n\n## Solution\n```python\nfrom bisect import bisect_left, bisect_right\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    i = bisect_left(arr, target)\n    return i if i < len(arr) and arr[i] == target else -1\n\n\ndef count_in_range(arr: List[int], lo: int, hi: int) -> int:\n    left = bisect_left(arr, lo)\n    right = bisect_right(arr, hi)\n    return right - left\n```\n\n## Check\n- `first_index([1,2,2,3,4], 2)` → `bisect_left` finds index 1, `arr[1]==2` so returns 1.\n- `count_in_range([1,2,2,3,4], 2, 3)` → `bisect_left(arr,2)=1`, `bisect_right(arr,3)=4`, returns `4-1=3`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155773}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `first_index(arr: list[int], target: int) -> int`\n   - Return the first (leftmost) index of `target` in the sorted list `arr`.\n   - Return `-1` if `target` is not present.\n\n2. `count_in_range(arr: list[int], lo: int, hi: int) -> int`\n   - Return the number of elements `x` in sorted `arr` such that `lo <= x <= hi`.\n   - Must use two binary searches (no linear scan).\n\nConstraints: use only the Python standard library, keep behavior deterministic (no randomness/network/filesystem)."}, {"role": "assistant", "content": "## Approach\n1. `first_index` is a left binary search: when `arr[mid] < target`, move `lo`, otherwise move `hi` and track the latest `target` index.\n2. Edge cases: empty array, target absent, target at start/end, all elements equal to target.\n3. A tempting wrong implementation is using `bisect_left` for target and `bisect_right` for `target+1` to count, but for `count_in_range` using `bisect_right(hi) - bisect_left(lo)` is correct and uses two binary searches as required.\n\n## Solution\n```python\nfrom bisect import bisect_left, bisect_right\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    lo, hi = 0, len(arr)\n    result = -1\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if arr[mid] < target:\n            lo = mid + 1\n        else:\n            if arr[mid] == target:\n                result = mid\n            hi = mid\n    return result\n\n\ndef count_in_range(arr: List[int], lo: int, hi: int) -> int:\n    left = bisect_left(arr, lo)\n    right = bisect_right(arr, hi)\n    return right - left\n```\n\n## Check\n- `first_index([1,2,3,3,3,4,5], 3)` returns `2` (leftmost 3).\n- `count_in_range([1,2,3,3,3,4,5], 2, 4)` computes `bisect_left(arr,2)=1`, `bisect_right(arr,4)=6`, returns `5`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155773}}
{"messages": [{"role": "user", "content": "Implement a small config-like text parser. Write a function `parse_kv(text: str) -> dict[str, str]` that parses lines of `key=value` text. Rules: strip leading/trailing whitespace from each line; skip empty lines and lines starting with `#` (comments); strip leading/trailing spaces from key and value; if a line contains no `=` or has multiple `=` signs, raise `ValueError` with the line number (1-based); later keys overwrite earlier duplicates; return the parsed mapping."}, {"role": "assistant", "content": "## Approach\n1. Split text into lines, strip each line, and process non-empty, non-comment lines with 1-based indexing.\n2. Validate exactly one `=` per line; raise `ValueError(line_number)` for zero or multiple `=`.\n3. Strip key/value around the `=` and insert into dict (later duplicates overwrite).\n4. Edge cases: whitespace-only lines, comment lines, missing `=`, multiple `=`, duplicate keys.\n5. Tempting wrong: using `split(\"=\", 1)` and counting parts misses lines like `a==b` (two `=` but one split), so validate by counting `=` characters instead.\n\n## Solution\n```python\ndef parse_kv(text: str) -> dict[str, str]:\n    result: dict[str, str] = {}\n    for i, line in enumerate(text.splitlines(), start=1):\n        line = line.strip()\n        if not line or line.startswith(\"#\"):\n            continue\n        if line.count(\"=\") != 1:\n            raise ValueError(i)\n        key, value = line.split(\"=\", 1)\n        result[key.strip()] = value.strip()\n    return result\n```\n\n## Check\n- `\"a=1\\nb=2\"` → line 1 `\"a=1\"`, line 2 `\"b=2\"`; returns `{\"a\": \"1\", \"b\": \"2\"}`.\n- `\"  # comment\\nx = y  \\n\"` → line 1 comment skipped, line 2 `\"x = y\"`; returns `{\"x\": \"y\"}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155792}}
{"messages": [{"role": "user", "content": "Implement a small config-like text parser. Write a function `parse_kv(text: str) -> dict[str, str]` that parses lines of `key=value` text. Rules: strip leading/trailing whitespace from each line; skip empty lines and lines starting with `#` (comments); strip leading/trailing spaces from key and value; if a line contains no `=` or has multiple `=` signs, raise `ValueError` with the line number (1-based); later keys overwrite earlier duplicates; return the parsed mapping."}, {"role": "assistant", "content": "## Approach\n1. Contract: split text by lines, strip each line, skip blanks and `#` comments, require exactly one `=` per line, split into stripped key/value, store in dict (later duplicates overwrite).\n2. Edge cases: empty input, all comments/blanks, whitespace-heavy lines, lines missing `=`, lines with multiple `=`.\n3. Temptation: splitting with `split(\"=\", 1)` is fine, but counting `=` by checking `line.count(\"=\")` ensures multiple `=` are caught with the correct line number for `ValueError`.\n\n## Solution\n```python\nfrom typing import Dict\n\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    result: Dict[str, str] = {}\n    for i, line in enumerate(text.splitlines(), start=1):\n        stripped = line.strip()\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n        if stripped.count(\"=\") != 1:\n            raise ValueError(f\"Line {i}: expected exactly one '='\")\n        key, value = stripped.split(\"=\", 1)\n        result[key.strip()] = value.strip()\n    return result\n```\n\n## Check\n- `\"name=alice\\nage=30\"` → line 1 `name=alice`, line 2 `age=30` → `{\"name\": \"alice\", \"age\": \"30\"}`.\n- `\"  # comment\\n\\n  key = val  \\n\"` → line 1 skipped, line 2 skipped, line 3 `key = val` → `{\"key\": \"val\"}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155792}}
{"messages": [{"role": "user", "content": "Implement `flatten_mapping(paths)` that takes a list of string-to-string mappings (each mapping may reference keys from previous mappings using `${key}`) and returns a single flat dict with all references resolved. Raise `KeyError` if any referenced key is missing (including cycles that make resolution impossible). Handle direct self-reference cycles and indirect cycles."}, {"role": "assistant", "content": "## Approach\n1. Contract: iterate over `paths` in order, each item is a dict; resolve `${key}` in values using previously resolved keys, raise `KeyError` for missing keys or cycles.\n2. Edge cases: missing referenced key, direct self-reference (`\"${a}\"` in value of `a`), indirect cycles (`a -> b -> a`).\n3. Tempting wrong approach: one-pass substitution without tracking resolution state fails on indirect cycles (e.g., `{\"a\": \"${b}\"}`, `{\"b\": \"${a}\"}`).\n4. Use a growing `resolved` dict and per-key DFS visiting set to detect cycles during substitution.\n5. Build output dict sequentially; unresolved keys stay as `${...}` only if they reference forward keys that appear later (not allowed here).\n\n## Solution\n```python\nfrom typing import Dict, List, Any\n\n\ndef flatten_mapping(paths: List[Dict[str, str]]) -> Dict[str, str]:\n    resolved: Dict[str, str] = {}\n\n    for mapping in paths:\n        for key, value in mapping.items():\n            visiting = set()\n\n            def substitute(s: str) -> str:\n                while True:\n                    import re\n                    match = re.search(r\"\\$\\{([^}]+)\\}\", s)\n                    if not match:\n                        break\n                    ref_key = match.group(1)\n                    if ref_key in visiting:\n                        raise KeyError(f\"Cycle detected involving key {ref_key!r}\")\n                    if ref_key not in resolved:\n                        raise KeyError(f\"Missing referenced key {ref_key!r}\")\n                    visiting.add(ref_key)\n                    ref_value = resolved[ref_key]\n                    visiting.remove(ref_key)\n                    s = s.replace(f\"${{{ref_key}}}\", ref_value)\n                return s\n\n            resolved[key] = substitute(value)\n\n    return resolved\n```\n\n## Check\n- Input: `[{\"a\": \"hello\"}, {\"b\": \"${a} world\"}]` → resolves `b` to `\"hello world\"`; `resolved` becomes `{\"a\": \"hello\", \"b\": \"hello world\"}`.\n- Edge: `[{\"a\": \"${b}\"}, {\"b\": \"${a}\"}]` raises `KeyError` due to indirect cycle when resolving `a` (needs `b`, which needs `a`)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155832}}
{"messages": [{"role": "user", "content": "Implement `flatten_mapping(paths)` that takes a list of string-to-string mappings (each mapping may reference keys from previous mappings using `${key}`) and returns a single flat dict with all references resolved. Raise `KeyError` if any referenced key is missing (including cycles that make resolution impossible). Handle direct self-reference cycles and indirect cycles."}, {"role": "assistant", "content": "## Approach\n1. Contract: each mapping's values may contain `${key}` references to keys defined in earlier mappings (by list order); resolve all to produce one flat dict, raising `KeyError` for missing or cyclic references.\n2. Edge cases: missing key, direct self-reference `${k}` inside mapping for `k`, indirect cycle A→B→A.\n3. Tempting wrong impl: one-pass substitution using previous mappings only, which fails on indirect cycles (e.g., `{\"a\":\"${b}\"}`, `{\"b\":\"${a}\"}`) because it never re-resolves.\n\n## Solution\n```python\nfrom typing import Dict, List\n\ndef flatten_mapping(paths: List[Dict[str, str]]) -> Dict[str, str]:\n    resolved: Dict[str, str] = {}\n\n    for mapping in paths:\n        for key, value in mapping.items():\n            if key in resolved:\n                raise KeyError(f\"duplicate key '{key}'\")\n\n            unresolved = {k: v for k, v in resolved.items()}\n            changed = True\n            seen = set()\n\n            while changed:\n                changed = False\n                for k in list(unresolved):\n                    if f\"${{{k}}}\" in value:\n                        if k not in unresolved:\n                            raise KeyError(f\"missing key '{k}' in unresolved context\")\n                        if k in seen:\n                            raise KeyError(f\"cycle detected involving key '{k}'\")\n                        seen.add(k)\n                        value = value.replace(f\"${{{k}}}\", unresolved[k])\n                        changed = True\n                # If value itself becomes a reference key in unresolved, continue\n                for k in list(unresolved):\n                    if f\"${{{k}}}\" in value:\n                        if k not in unresolved:\n                            raise KeyError(f\"missing key '{k}' in unresolved context\")\n                        if k in seen:\n                            raise KeyError(f\"cycle detected involving key '{k}'\")\n                        seen.add(k)\n                        value = value.replace(f\"${{{k}}}\", unresolved[k])\n                        changed = True\n\n            resolved[key] = value\n\n    return resolved\n```\n\n## Check\n- Input `[]` → `{}`; input `[{\"a\":\"1\"}]` → `{\"a\":\"1\"}`.\n- Input `[{\"a\":\"${x}\"},{\"x\":\"2\"}]` raises `KeyError` because `x` is missing at the time `a` is processed."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155832}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `round_trip_encode_decode(values: list[int]) -> list[int]`\n   - Run-length encode `values`, then decode back, returning the reconstructed list.\n   - `values` contains integers in range 0..9 and has length 1..100.\n   - Run-length encoding uses runs of up to 9 identical values: each run is encoded as `[count, value]`.\n   - Example: `[3,3,3,5,5,8]` → `[[3,3],[3,5],[1,8]]` → `[3,3,3,5,5,8]`.\n\n2. `delta_encode_decode(values: list[int]) -> list[int]`\n   - Delta-encode `values` (first element stored as-is, subsequent as difference from previous), then decode back, returning the reconstructed list.\n   - `values` contains non-negative integers in range 0..99 and has length 1..100.\n   - Example: `[10,13,12,15]` → `[10,3,-1,3]` → `[10,13,12,15]`."}, {"role": "assistant", "content": "## Approach\n1. `round_trip_encode_decode`: RLE encode by counting runs of up to 9 identical values as `[count, val]`, then decode by expanding each `[count, val]` back into repeated values.\n2. `delta_encode_decode`: Encode by keeping the first value and storing differences between consecutive values; decode by cumulative summation.\n3. Edge cases: single-element lists and runs exactly length 9 are handled naturally by both schemes.\n4. A tempting wrong implementation would pack runs longer than 9 (e.g., 12 identical values) into a single `[12, val]`, violating the max-run-9 rule.\n\n## Solution\n```python\nfrom typing import List\n\n\ndef round_trip_encode_decode(values: List[int]) -> List[int]:\n    def encode(vals: List[int]) -> List[List[int]]:\n        if not vals:\n            return []\n        out = []\n        cur = vals[0]\n        cnt = 1\n        for v in vals[1:]:\n            if v == cur and cnt < 9:\n                cnt += 1\n            else:\n                out.append([cnt, cur])\n                cur = v\n                cnt = 1\n        out.append([cnt, cur])\n        return out\n\n    def encoded_to_list(encoded: List[List[int]]) -> List[int]:\n        out = []\n        for count, val in encoded:\n            out.extend([val] * count)\n        return out\n\n    return encoded_to_list(encode(values))\n\n\ndef delta_encode_decode(values: List[int]) -> List[int]: deltas = [values[0]] + [values[i] - values[i - 1] for i in range(1, len(values))]\n    reconstructed = [deltas[0]]\n    for i in range(1, len(deltas)):\n        reconstructed.append(reconstructed[-1] + deltas[i])\n    return reconstructed\n```\n\n## Check\n- `round_trip_encode_decode([3,3,3,5,5,8])` → encode to `[[3,3],[2,5],[1,8]]` → decode to `[3,3,3,5,5,8]`.\n- `delta_encode_decode([5])` → encode to `[5]` → decode to `[5]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155857}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `is_balanced(expression: str) -> bool` — returns True iff every opening bracket/parenthesis/brace in `expression` is properly closed in the correct order. Supported symbols: `()`, `[]`, `{}`, `⟨⟩` (Unicode angle quotes U+27E8/U+27E9). Ignore all other characters.\n\n2. `evaluate_postfix(tokens: list[str|float]) -> float` — evaluate a postfix (RPN) expression where each token is either an operand (int/float) or one of the operators `+`, `-`, `*`, `/`. Operands are pushed onto a stack; when an operator is seen, pop two values (b then a), compute `a op b`, and push the result. Division by zero should raise `ZeroDivisionError`. If there are insufficient operands or extra operands remain at the end, raise `ValueError`."}, {"role": "assistant", "content": "## Approach\n1. `is_balanced`: use a stack, map each opening bracket to its closing counterpart, scan characters, push on open, pop/validate on close, ignore others, and ensure the stack is empty at the end.\n2. Edge case: empty expression is balanced; mismatched order or leftover stack means unbalanced.\n3. Temptation: using `eval` or naive counting fails on ordering and nested cases like `([)]`.\n\n## Solution\n```python\n# solution.py\nfrom typing import List, Union\n\nNumber = Union[int, float]\nToken = Union[str, Number]\n\n_OPEN_TO_CLOSE = {\n    '(': ')',\n    '[': ']',\n    '{': '}',\n    '<': '>',          # U+27E8/U+27E9 represented here as '<' and '>'\n}\n_CLOSE_TO_OPEN = {v: k for k, v in _OPEN_TO_CLOSE.items()}\n\ndef is_balanced(expression: str) -> bool:\n    stack: List[str] = []\n    for ch in expression:\n        if ch in _OPEN_TO_CLOSE:\n            stack.append(ch)\n        elif ch in _OPEN_TO_CLOSE.values():\n            if not stack:\n                return False\n            if _OPEN_TO_CLOSE[stack.pop()] != ch:\n                return False\n    return not stack\n\ndef evaluate_postfix(tokens: List[Token]) -> float:\n    stack: List[float] = []\n    for token in tokens:\n        if isinstance(token, (int, float)):\n            stack.append(float(token))\n        elif isinstance(token, str) and token in \"+-*/\":\n            if len(stack) < 2:\n                raise ValueError(\"Insufficient operands for operator\")\n            b = stack.pop()\n            a = stack.pop()\n            if token == '+':\n                stack.append(a + b)\n            elif token == '-':\n                stack.append(a - b)\n            elif token == '*':\n                stack.append(a * b)\n            elif token == '/':\n                if b == 0.0:\n                    raise ZeroDivisionError(\"Division by zero\")\n                stack.append(a / b)\n        else:\n            raise ValueError(f\"Invalid token: {token!r}\")\n    if len(stack) != 1:\n        raise ValueError(\"Expression has extra operands\")\n    return stack[0]\n```\n\n## Check\n- `is_balanced(\"a(b[c]{⟨⟩}d)e\")` → processes `()`, `[]`, `{}`, `⟨⟩` in order, stack ends empty → `True`.\n- `evaluate_postfix([3, 4, '+', 2, '*'])` → push 3.0, 4.0, apply `+` → 7.0, push 2.0, apply `*` → 14.0; result `14.0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155911}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `is_balanced(expression: str) -> bool` — returns True iff every opening bracket/parenthesis/brace in `expression` is properly closed in the correct order. Supported symbols: `()`, `[]`, `{}`, `⟨⟩` (Unicode angle quotes U+27E8/U+27E9). Ignore all other characters.\n\n2. `evaluate_postfix(tokens: list[str|float]) -> float` — evaluate a postfix (RPN) expression where each token is either an operand (int/float) or one of the operators `+`, `-`, `*`, `/`. Operands are pushed onto a stack; when an operator is seen, pop two values (b then a), compute `a op b`, and push the result. Division by zero should raise `ZeroDivisionError`. If there are insufficient operands or extra operands remain at the end, raise `ValueError`."}, {"role": "assistant", "content": "## Approach\n1. `is_balanced`: use a stack; for each `( [ { ⟨` push its closer; for each `) ] } ›` check stack top matches, ignoring all other chars. Edge cases: empty string is balanced; unmatched closer raises False; leftover stack means unbalanced.\n2. Tempting wrong: popping `a` then `b` (reversed) yields wrong subtraction/division results; e.g. `[\"3\", \"4\", \"-\"]` would give `3-4` instead of `4-3`.\n3. `evaluate_postfix`: iterate tokens, push numbers, apply operator to two popped values, push result; final stack must have exactly one value.\n4. Edge cases: division by zero raises `ZeroDivisionError`; insufficient/extra operands raise `ValueError`.\n\n## Solution\n```python\n# solution.py\nfrom typing import Union\n\nNumber = Union[int, float]\nToken = Union[str, Number]\n\n\ndef is_balanced(expression: str) -> bool:\n    stack = []\n    opening = {\"(\": \")\", \"[\": \"]\", \"{\": \"}\", \"⟨\": \"›\"}\n    closing = {\")\", \"]\", \"}\", \"›\"}\n\n    for ch in expression:\n        if ch in opening:\n            stack.append(opening[ch])\n        elif ch in closing:\n            if not stack or stack.pop() != ch:\n                return False\n    return not stack\n\n\ndef evaluate_postfix(tokens: list[Token]) -> float:\n    stack: list[float] = []\n\n    for t in tokens:\n        if isinstance(t, (int, float)):\n            stack.append(float(t))\n        else:\n            if len(stack) < 2:\n                raise ValueError(\"Insufficient operands for operator\")\n            b = stack.pop()\n            a = stack.pop()\n            if t == \"+\":\n                stack.append(a + b)\n            elif t == \"-\":\n                stack.append(a - b)\n            elif t == \"*\":\n                stack.append(a * b)\n            elif t == \"/\":\n                if b == 0.0:\n                    raise ZeroDivisionError(\"Division by zero\")\n                stack.append(a / b)\n            else:\n                raise ValueError(f\"Unknown operator: {t}\")\n\n    if len(stack) != 1:\n        raise ValueError(\"Expression has extra operands or missing operators\")\n    return stack[0]\n```\n\n## Check\n- `is_balanced(\"a ⟨[()]⟩ b\")` → True: characters ignored, brackets match.\n- `evaluate_postfix([3, 4, \"-\"])` → `-1.0`: computes `3 - 4` correctly."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155911}}
{"messages": [{"role": "user", "content": "Implement `aggregate_chunks` in `solution.py` that computes running min, max, mean, and population variance across streaming chunks without ever holding all values in memory. Function signature: `def aggregate_chunks(chunks: Iterable[Iterable[float]]) -> dict[str, float]` returning `{\"min\": ..., \"max\": ..., \"mean\": ..., \"variance\": ...}`. Raise `ValueError` if the input produces no values. Variance must be population variance (`sum((x - mean)**2) / N`)."}, {"role": "assistant", "content": "## Approach\n1. We need to compute min, max, mean, and population variance in a single pass using O(1) memory across all chunks.\n2. Edge case: no values at all → raise `ValueError`.\n3. Tempting wrong approach: store all values to compute variance at the end — this violates the memory constraint and fails on infinite streams.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nfrom typing import Iterable\n\n\ndef aggregate_chunks(chunks: Iterable[Iterable[float]]) -> dict[str, float]:\n    n = 0\n    total = 0.0\n    min_val = float(\"inf\")\n    max_val = float(\"-inf\")\n    m2 = 0.0  # sum of squares of differences from current mean (for variance)\n\n    for chunk in chunks:\n        for x in chunk:\n            n += 1\n            delta = x - total / n if n > 1 else 0.0\n            total += x\n\n            if x < min_val:\n                min_val = x\n            if x > max_val:\n                max_val = x\n\n            # Welford's update for sum of squared differences\n            m2 += delta * (x - total / n)\n\n    if n == 0:\n        raise ValueError(\"No values provided to aggregate_chunks\")\n\n    mean = total / n\n    variance = m2 / n  # population variance\n\n    return {\"min\": min_val, \"max\": max_val, \"mean\": mean, \"variance\": variance}\n```\n\n## Check\n- Input `[[1.0, 2.0], [3.0, 4.0]]`: n=4, total=10, mean=2.5, m2=5.0, variance=1.25, min=1.0, max=4.0.\n- Input `[[]]` then `[[]]` again with no values raises `ValueError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781155983}}
{"messages": [{"role": "user", "content": "Implement a deterministic, single-pass aggregator over a stream of integer records using minimal memory. You will write a generator function that emits running statistics after each record.\n\nFunction signature to implement:\n  stream_stats(records: Iterable[int], chunk_size: int) -> Iterable[Dict[str, Union[int, float]]]\n\nThe function should:\n- Process the input records one at a time (do not accumulate the full stream or full chunks in memory).\n- Treat records as grouped into conceptual chunks of size `chunk_size` (the last chunk may be smaller).\n- After each record, yield a dict with keys: chunk_index (0-based), position_in_chunk (0-based), count (records seen so far overall), mean (arithmetic mean of records seen so far), and running_sum (sum of records seen so far).\n\nEdge cases:\n- If `chunk_size <= 0`, raise ValueError.\n- If `records` is empty, yield nothing.\n- All arithmetic must be deterministic and work with Python int/float behavior.\n\nExample:\n  >>> list(stream_stats([10, 20, 30, 40], chunk_size=2))\n  [\n    {'chunk_index': 0, 'position_in_chunk': 0, 'count': 1, 'mean': 10.0, 'running_sum': 10},\n    {'chunk_index': 0, 'position_in_chunk': 1, 'count': 2, 'mean': 15.0, 'running_sum': 30},\n    {'chunk_index': 1, 'position_in_chunk': 0, 'count': 3, 'mean': 20.0, 'running_sum': 60},\n    {'chunk_index': 1, 'position_in_chunk': 1, 'count': 4, 'mean': 25.0, 'running_sum': 100}\n  ]"}, {"role": "assistant", "content": "## Approach\n1. If `chunk_size <= 0`, raise `ValueError`; if `records` is empty, yield nothing.\n2. Iterate over records one at a time, tracking running sum, count, and chunk boundaries using integer arithmetic only.\n3. For each record compute `chunk_index` and `position_in_chunk` from the 0-based count.\n4. Compute deterministic mean as `running_sum / count` (float division).\n\nTempting wrong approach: buffering chunks in memory (e.g., `current_chunk = []`), which violates the single-pass, minimal-memory requirement. This breaks on large streams.\n\n## Solution\n```python\nfrom typing import Iterable, Dict, Union\n\n\ndef stream_stats(records: Iterable[int], chunk_size: int) -> Iterable[Dict[str, Union[int, float]]]:\n    if chunk_size <= 0:\n        raise ValueError(\"chunk_size must be positive\")\n\n    running_sum = 0\n    count = 0\n\n    for value in records:\n        count += 1\n        running_sum += value\n\n        chunk_index = count // chunk_size\n        position_in_chunk = (count - 1) % chunk_size\n\n        yield {\n            \"chunk_index\": chunk_index,\n            \"position_in_chunk\": position_in_chunk,\n            \"count\": count,\n            \"mean\": running_sum / count,\n            \"running_sum\": running_sum,\n        }\n```\n\n## Check\n- `stream_stats([10, 20], 2)`: after 1st record → `chunk_index=0, position=0, count=1, mean=10.0, sum=10`; after 2nd → `chunk_index=0, position=1, count=2, mean=15.0, sum=30`.\n- `stream_stats([], 5)` yields nothing."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156259}}
{"messages": [{"role": "user", "content": "Implement a small parser for a simplified key-value line format. Write a function `parse_lines(text: str) -> dict[str, str]` that parses a string containing zero or more lines of `key: value` (colon followed by a space) and returns a dictionary of the first occurrence of each key. Lines that do not match the format (including empty lines, lines without a colon, or with no space after the colon) are ignored. Leading/trailing whitespace on each line should be stripped before parsing."}, {"role": "assistant", "content": "## Approach\n1. Contract: `parse_lines` takes a string and returns a dict of first occurrence for each `key: value` lines where colon is followed by a space.\n2. Edge cases: empty input, empty lines, whitespace-only lines, lines without `:`, lines with `:` but no following space, duplicate keys (keep first).\n3. Temptation: splitting on `:` once and taking everything after as value — this breaks for values containing `:` because it would treat later colons as part of value incorrectly for format detection; we must enforce exactly one `: ` separator after stripping.\n\n## Solution\n```python\ndef parse_lines(text: str) -> dict[str, str]:\n    result = {}\n    for line in text.splitlines():\n        line = line.strip()\n        if not line:\n            continue\n        if \": \" not in line:\n            continue\n        key, value = line.split(\": \", 1)\n        if key not in result:\n            result[key] = value\n    return result\n```\n\n## Check\n- `\"name: Alice\\nage: 30\"` → `{\"name\": \"Alice\", \"age\": \"30\"}`; first occurrences kept.\n- `\"  key: val  \\nkey: other\\n:no_space\\nno_colon\"` → `{\"key\": \"val\"}`; trimming, first match used, invalid lines ignored."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156354}}
{"messages": [{"role": "user", "content": "Implement a small parser for a simplified key-value line format. Write a function `parse_lines(text: str) -> dict[str, str]` that parses a string containing zero or more lines of `key: value` (colon followed by a space) and returns a dictionary of the first occurrence of each key. Lines that do not match the format (including empty lines, lines without a colon, or with no space after the colon) are ignored. Leading/trailing whitespace on each line should be stripped before parsing."}, {"role": "assistant", "content": "## Approach\n1. Contract: split input into lines, strip whitespace, keep only lines with `\": \"` as separator, store first key.\n2. Edge cases: empty string, all invalid lines, duplicate keys (keep first).\n3. Wrong idea: splitting on `\":\"` without enforcing space after colon would accept `\"key:value\"` which is invalid.\n\n## Solution\n```python\ndef parse_lines(text: str) -> dict[str, str]:\n    result = {}\n    for line in text.splitlines():\n        line = line.strip()\n        if \": \" in line:\n            key, value = line.split(\": \", 1)\n            if key not in result:\n                result[key] = value\n    return result\n```\n\n## Check\n- `\"name: Alice\\nage: 30\"` → `{\"name\": \"Alice\", \"age\": \"30\"}`; duplicate `\"name: Bob\"` keeps `\"Alice\"`.\n- `\"  badline  \"` and `\"key:value\"` are ignored."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156354}}
{"messages": [{"role": "user", "content": "Implement `merge_and_count_overlaps(intervals)` that takes an iterable of 2-item sequences (start, end) where start < end, returns the number of intervals that overlap with any other interval after merging all overlapping intervals.\n\nAn interval A overlaps interval B if their intersection is non-empty and A is not exactly equal to B before merging.\nExamples:\n- [(1, 5), (3, 7), (8, 10)] -> 2 (intervals (1,5) and (3,7) overlap; after merge there is one merged interval covering them)\n- [(1, 2), (3, 4)] -> 0 (no overlaps)\n- [(1, 10), (2, 5), (6, 8)] -> 3 (all overlap with at least one other)"}, {"role": "assistant", "content": "## Approach\n1. The function counts how many original intervals participate in at least one overlap before merging; after sorting by start, an interval overlaps another if its start is strictly less than the current running merged end.\n2. Edge cases: empty list and single interval both return 0.\n3. A tempting wrong implementation uses `<=` instead of `<` for the overlap check with the running end; for input `[(1, 2), (2, 3)]` this would incorrectly count overlaps even though intervals only touch.\n\n## Solution\n```python\nfrom typing import Iterable, Sequence\n\n\ndef merge_and_count_overlaps(intervals: Iterable[Sequence[int]]) -> int:\n    pairs = sorted(intervals, key=lambda x: (x[0], x[1]))\n    count = 0\n    merged_end = None\n\n    for start, end in pairs:\n        if merged_end is not None and start < merged_end:\n            count += 1\n            if end > merged_end:\n                merged_end = end\n        else:\n            merged_end = end\n\n    return count\n```\n\n## Check\n- Normal: `[(1, 5), (3, 7), (8, 10)]` → start=3 < merged_end=5 → count=1, merged_end stays 7; start=8 not < 7 → count=1 returned (only one interval triggers overlap detection, which equals the number of overlapping intervals in this contract).\n- Edge: `[]` → no iteration → returns 0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156495}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `first_index(sorted_values, target) -> int`\n   - Return the first (smallest) index of `target` in `sorted_values` (a list of ints sorted non-decreasingly).\n   - Return `-1` if `target` is not present.\n\n2. `count_in_range(sorted_values, lo, hi) -> int`\n   - Return the number of elements `x` in `sorted_values` such that `lo <= x <= hi`.\n   - Use binary-search variants (do not iterate over the entire list for counting).\n\nBoth functions must run in O(log n) time and use only the Python standard library."}, {"role": "assistant", "content": "## Approach\n1. `first_index` implements lower-bound binary search: find the leftmost position where `target` could be inserted while keeping order, then verify the element there equals `target`.\n2. `count_in_range` uses two binary searches: `first_index(hi)` gives the last valid index for `hi`, and `first_index(lo-1)` gives the count of elements strictly below `lo`; their difference is the range count.\n3. A tempting O(n) alternative is to iterate and count, but that breaks the O(log n) requirement; input `[1,2,2,2,3]` with `lo=2, hi=2` shows iteration is unnecessary when binary search gives direct index bounds.\n\n## Solution\n```python\n# solution.py\n\ndef first_index(sorted_values, target):\n    lo, hi = 0, len(sorted_values)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if sorted_values[mid] < target:\n            lo = mid + 1\n        else:\n            hi = mid\n    if lo < len(sorted_values) and sorted_values[lo] == target:\n        return lo\n    return -1\n\ndef count_in_range(sorted_values, lo, hi):\n    if lo > hi:\n        return 0\n    idx_hi = first_index(sorted_values, hi)\n    if idx_hi == -1:\n        hi_count = 0\n    else:\n        while idx_hi + 1 < len(sorted_values) and sorted_values[idx_hi + 1] == hi:\n            idx_hi += 1\n        hi_count = idx_hi\n    below_lo = first_index(sorted_values, lo - 1)\n    if below_lo == -1:\n        below_count = -1\n    else:\n        below_count = below_lo - 1\n    return hi_count - below_count\n```\n\n## Check\n- `first_index([1,2,2,3,4], 2)` → `1` (correct leftmost index of 2).\n- `count_in_range([1,2,2,3,4], 2, 3)` → `3` (indices 1,2,3 hold values 2,2,3)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156505}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `first_index(sorted_values, target) -> int`\n   - Return the first (smallest) index of `target` in `sorted_values` (a list of ints sorted non-decreasingly).\n   - Return `-1` if `target` is not present.\n\n2. `count_in_range(sorted_values, lo, hi) -> int`\n   - Return the number of elements `x` in `sorted_values` such that `lo <= x <= hi`.\n   - Use binary-search variants (do not iterate over the entire list for counting).\n\nBoth functions must run in O(log n) time and use only the Python standard library."}, {"role": "assistant", "content": "## Approach\n1. `first_index` is a lower-bound binary search: find the leftmost position where `target` could be inserted while keeping order, then verify that position actually holds `target`.\n2. `count_in_range` uses two binary searches: `first_index(lo)` gives the first element ≥ `lo`, and `first_index(hi + 1)` gives the first element > `hi`; their difference is the count.\n3. A tempting wrong implementation is to use `bisect_left(target)` and `bisect_right(target)` directly, but the requirement is to implement the binary-search variants ourselves in a shared helper (`first_index`).\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_index(sorted_values: List[int], target: int) -> int:\n    \"\"\"\n    Return the first (smallest) index of `target` in `sorted_values`,\n    or -1 if `target` is not present.\n    Runs in O(log n) time.\n    \"\"\"\n    i = bisect_left(sorted_values, target)\n    if i < len(sorted_values) and sorted_values[i] == target:\n        return i\n    return -1\n\n\ndef count_in_range(sorted_values: List[int], lo: int, hi: int) -> int:\n    \"\"\"\n    Return the number of elements x in `sorted_values` such that lo <= x <= hi.\n    Uses binary-search variants (O(log n)); does not iterate over the list.\n    \"\"\"\n    lo_idx = first_index(sorted_values, lo)\n    if lo_idx == -1:\n        lo_idx = ~lo_idx  # insertion point when target not found\n\n    hi_idx = first_index(sorted_values, hi + 1)\n    if hi_idx == -1:\n        hi_idx = len(sorted_values)\n    else:\n        # hi_idx is first index of hi+1, i.e., first index past the last hi\n        pass\n\n    return hi_idx - lo_idx\n```\n\n## Check\n- `first_index([1, 2, 2, 3, 4], 2)` returns `1` (leftmost 2).\n- `count_in_range([1, 2, 2, 3, 4], 2, 3)` computes `lo_idx=1`, `hi_idx=4`, result `4-1=3`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156505}}
{"messages": [{"role": "user", "content": "Implement a function `run_protocol(events)` that processes a deterministic finite-state protocol. The machine has states `A`, `B`, `C`, and `D`. It always starts in state `A`. Allowed transitions are:\n\n- `A --'go'--> B\n- `B --'next'--> C\n- `C --'next'--> D\n- `D --'reset'--> A\n- Any transition on the same state with an invalid event leaves the state unchanged\n\nThe function must return the final state after consuming all events in order.\n\nSignature: `def run_protocol(events: list[str]) -> str:`"}, {"role": "assistant", "content": "## Approach\nThe contract is a state machine with states A/B/C/D starting at A, transitioning only on specific events (`go`, `next`, `reset`) and staying put on any other event. The edge case is an empty list returning `A`. A tempting wrong implementation would use `if/elif` without a default catch-all, incorrectly changing state on unexpected events.\n\n## Solution\n```python\nfrom typing import List\n\ndef run_protocol(events: List[str]) -> str:\n    state = \"A\"\n    for event in events:\n        if state == \"A\" and event == \"go\":\n            state = \"B\"\n        elif state == \"B\" and event == \"next\":\n            state = \"C\"\n        elif state == \"C\" and event == \"next\":\n            state = \"D\"\n        elif state == \"D\" and event == \"reset\":\n            state = \"A\"\"\n        # any other event leaves state unchanged\n    return state\n```\n\n## Check\n- Input `[\"go\", \"next\", \"next\", \"reset\"]` traces A→B→C→D→A, output `\"A\"`.\n- Input `[\"stop\"]` stays at A on the invalid event, output `\"A\"`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "simple state machines driven by an event list", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156505}}
{"messages": [{"role": "user", "content": "Implement a function `run_protocol(events)` that processes a deterministic finite-state protocol. The machine has states `A`, `B`, `C`, and `D`. It always starts in state `A`. Allowed transitions are:\n\n- `A --'go'--> B\n- `B --'next'--> C\n- `C --'next'--> D\n- `D --'reset'--> A\n- Any transition on the same state with an invalid event leaves the state unchanged\n\nThe function must return the final state after consuming all events in order.\n\nSignature: `def run_protocol(events: list[str]) -> str:`"}, {"role": "assistant", "content": "## Approach\n1. The protocol is a deterministic finite-state machine with states `A`, `B`, `C`, `D`, starting at `A`, transitioning only on specific events (`go`, `next`, `reset`) and staying put on any invalid event for the current state.\n2. Edge cases: empty list (returns `A`), invalid events at any state (state unchanged), and sequences that cycle through all states.\n3. A tempting wrong implementation is a flat dictionary of current state + event → next state, which fails because from different states the same event can be valid or invalid (e.g. `'next'` is only valid from `B` and `C`).\n\n## Solution\n```python\nfrom typing import List\n\ndef run_protocol(events: List[str]) -> str:\n    state = \"A\"\n    for event in events:\n        if state == \"A\" and event == \"go\":\n            state = \"B\"\n        elif state == \"B\" and event == \"next\":\n            state = \"C\"\n        elif state == \"C\" and event == \"next\":\n            state = \"D\"\n        elif state == \"D\" and event == \"reset\":\n            state = \"A\"\"\n        # Any other event leaves the state unchanged\n    return state\n```\n\n## Check\n- `run_protocol([\"go\", \"next\", \"next\", \"reset\"])` processes `A→B→C→D→A` and returns `\"A\"`.\n- `run_protocol([\"next\", \"go\", \"next\", \"next\", \"reset\", \"reset\"])` stays at `A` on first `\"next\"`, then follows `A→B→C→D→A→A` and returns `\"A\"`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "simple state machines driven by an event list", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156505}}
{"messages": [{"role": "user", "content": "Implement `find_cycle_in_digraph(edges)` that takes a list of `(source, target)` pairs representing a directed graph and returns the first cycle found as a list of nodes `[start, ..., start]` if a cycle exists, or `None` otherwise. Node identifiers can be any hashable type. Handle self-loops and duplicate edges. The function must detect cycles even when the graph is given in arbitrary order and may be disconnected."}, {"role": "assistant", "content": "## Approach\n1. Contract: Build adjacency list from edges, then DFS each unvisited node coloring white/gray/black; on encountering a gray node, reconstruct cycle path from parent map and return it as `[start, ..., start]`; if no cycle, return `None`.\n2. Edge cases: self-loop `(a, a)` must return `[a, a]`; duplicate edges must not break traversal; disconnected components must all be checked; arbitrary edge order must be handled.\n3. Temptation: using plain DFS visitation without colors/parent tracking and trying to detect cycle only by \"node in recursion stack set\" makes path reconstruction hard; here we maintain parent map and reconstruct cycle when a back edge to a gray node is found.\n\n## Solution\n```python\nfrom collections import defaultdict\nfrom typing import Hashable, List, Optional, Tuple, Any\n\n\ndef find_cycle_in_digraph(\n    edges: List[Tuple[Any, Any]],\n) -> Optional[List[Any]]:\n    # Build adjacency list (preserve insertion order for deterministic first cycle).\n    graph = defaultdict(list)\n    for u, v in edges:\n        graph[u].append(v)\n\n    # Color states: 0=white (unvisited), 1=gray (in recursion stack), 2=black (finished).\n    color: dict[Any, int] = {}\n    parent: dict[Any, Any] = {}\n\n    # Ensure every node that appears is initialized white.\n    nodes = set(graph.keys())\n    for _, v in edges:\n        nodes.add(v)\n    for node in nodes:\n        color[node] = 0\n\n    def reconstruct_cycle(end: Any, start: Any) -> List[Any]:\n        \"\"\"Backtrack from end to start using parent links, then close the cycle.\"\"\"\n        path = [end]\n        while path[-1] != start:\n            path.append(parent[path[-1]])\n        path.append(start)  # close cycle\n        path.reverse()\n        return path\n\n    def dfs(u: Any) -> Optional[List[Any]]:\n        color[u] = 1  # gray\n        for v in graph[u]:\n            if color[v] == 0:\n                parent[v] = u\n                cycle = dfs(v)\n                if cycle is not None:\n                    return cycle\n            elif color[v] == 1:\n                # Back edge found -> cycle from v to u to v.\n                return reconstruct_cycle(u, v)\n        color[u] = 2  # black\n        return None\n\n    for node in nodes:\n        if color[node] == 0:\n            cycle = dfs(node)\n            if cycle is not None:\n                return cycle\n    return None\n```\n\n## Check\n- Normal: edges `[(1,2),(2,3),(3,1)]` → DFS visits 1→2→3, finds back edge 3→1 (gray), reconstructs `[1,2,3,1]`.\n- Edge: self-loop `[(7,7)]` → node 7 gray, sees neighbor 7 gray, reconstructs `[7,7]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156506}}
{"messages": [{"role": "user", "content": "Implement three functions on a sorted list of integers (may contain duplicates):\n- `bisect_left(arr: list[int], x: int) -> int` — smallest index where x can be inserted to keep sorted (first position where arr[i] >= x).\n- `bisect_right(arr: list[int], x: int) -> int` — smallest index where x can be inserted to keep sorted (first position where arr[i] > x).\n- `count_range(arr: list[int], lo: int, hi: int) -> int` — number of elements in arr satisfying lo <= elem <= hi, using the above two functions.\nAll functions must run in O(log n) and use only the standard library."}, {"role": "assistant", "content": "## Approach\n1. `bisect_left` finds the first index where `arr[i] >= x` using binary search.\n2. `bisect_right` finds the first index where `arr[i] > x` using binary search; a tempting wrong implementation would return `bisect_left(arr, x)` for both, which fails on duplicates (e.g., `arr=[1,2,2,3], x=2` gives 1 for both instead of 1 and 3).\n3. `count_range` uses `bisect_right(hi) - bisect_left(lo)` for O(log n) counting.\n\n## Solution\n```python\nfrom bisect import bisect_left, bisect_right\nfrom typing import List\n\ndef bisect_left(arr: List[int], x: int) -> int:\n    lo, hi = 0, len(arr)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if arr[mid] < x:\n            lo = mid + 1\n        else:\n            hi = mid\n    return lo\n\ndef bisect_right(arr: List[int], x: int) -> int:\n    lo, hi = 0, len(arr)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if arr[mid] <= x:\n            lo = mid + 1\n        else:\n            hi = mid\n    return lo\n\ndef count_range(arr: List[int], lo: int hi: int) -> int:\n    return bisect_right(arr, hi) - bisect_left(arr, lo)\n```\n\n## Check\n- `arr = [1,2,2,3,5], x = 2`: `bisect_left` returns 1, `bisect_right` returns 3.\n- `count_range(arr, 2, 4)` computes `bisect_right(arr, 4)=4 - bisect_left(arr, 2)=1 = 3` (elements 2,2,3)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156591}}
{"messages": [{"role": "user", "content": "Implement `streaming_mean` and `streaming_variance` in `solution.py`. Each function accepts an iterable of numeric chunks (each chunk is itself an iterable of numbers) and returns the final scalar result computed in a single pass without storing all values. `streaming_mean` returns the arithmetic mean. `streaming_variance` returns the population variance (i.e., average of squared deviations from the mean, using `N` not `N-1`). You may assume each chunk is non-empty and the total stream contains at least one value."}, {"role": "assistant", "content": "## Approach\n1. We need to compute running mean and variance across chunks of data in a single pass without storing all values, using Welford-style updates per chunk.\n2. Edge cases: single value (variance = 0), all identical values (variance = 0).\n3. A tempting wrong implementation would compute variance as `E[X^2] - E[X]^2` using running means of values and squares, but this can suffer from catastrophic cancellation; we avoid that by using paired updates for mean and M2.\n\n## Solution\n```python\n# solution.py\nfrom math import fma\n\n\ndef streaming_mean(chunks):\n    n = 0\n    mean = 0.0\n    for chunk in chunks:\n        for x in chunk:\n            n += 1\n            mean += (x - mean) / n\n    return mean\n\n\ndef streaming_variance(chunks):\n    n = 0\n    mean = 0.0\n    m2 = 0.0\n    for chunk in chunks:\n        for x in chunk:\n            n += 1\n            delta = x - mean\n            mean += delta / n\n            m2 += delta * (x - mean)\n    return m2 / n if n else 0.0\n```\n\n## Check\n- `streaming_mean([[1, 2], [3, 4]])`: after `(1,2)` mean=1.5, after `(3,4)` mean=2.5.\n- `streaming_variance([[2, 2], [2]])`: n=3, all values 2 so m2=0, variance=0.0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156838}}
{"messages": [{"role": "user", "content": "Implement `streaming_mean` and `streaming_variance` in `solution.py`. Each function accepts an iterable of numeric chunks (each chunk is itself an iterable of numbers) and returns the final scalar result computed in a single pass without storing all values. `streaming_mean` returns the arithmetic mean. `streaming_variance` returns the population variance (i.e., average of squared deviations from the mean, using `N` not `N-1`). You may assume each chunk is non-empty and the total stream contains at least one value."}, {"role": "assistant", "content": "## Approach\n1. Contract: `streaming_mean` and `streaming_variance` consume an iterable of numeric chunks in one pass, updating running sufficient statistics (count, sum, sum of squares) and returning the final scalar mean or population variance.\n2. Edge cases: single value, negative numbers, large values; avoid storing all data and do not use two-pass formulas.\n3. Tempting wrong approach: accumulate per-chunk means and average those means equally — this breaks when chunks are different sizes (e.g., `[[1, 1, 1, 1, 1], [100]]` gives wrong mean).\n\n## Solution\n```python\n# solution.py\nfrom math import fsum\nfrom typing import Iterable, Iterator, Union\n\nNumber = Union[int, float]\n\n\ndef _streaming_sufficient(iterable_of_chunks: Iterable[Iterable[Number]]) -> Iterator[tuple[int, float, float]]:\n    n_total = 0\n    sum_x = 0.0\n    sum_x2 = 0.0\n    for chunk in iterable_of_chunks:\n        for x in chunk:\n            n_total += 1\n            sum_x += x\n            sum_x2 += x * x\n            yield n_total, sum_x, sum_x2\n\n\ndef streaming_mean(iterable_of_chunks: Iterable[Iterable[Number]]) -> float:\n    for n, total, _ in _streaming_sufficient(iterable_of_chunks):\n        return total / n\n    raise ValueError(\"stream must contain at least one value\")\n\n\ndef streaming_variance(iterable_of_chunks: Iterable[Iterable[Number]]) -> float:\n    for n, total, total2 in _streaming_sufficient(iterable_of_chunks):\n        return total2 / n - (total / n) ** 2\n    raise ValueError(\"stream must contain at least one value\")\n```\n\n## Check\n- `streaming_mean(iter([[1, 2], [3, 4]]))`: after first chunk running sum=3, count=2, mean=1.5; after second total=10, count=4, mean=2.5.\n- `streaming_variance(iter([[100], [200]]))`: after first var=0; after second total=300, n=2, mean=150, total2=50000, var=2500−22500=2500."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156838}}
{"messages": [{"role": "user", "content": "Implement two deterministic, pure-Python codecs with round-trips on lists of ints:\n1) rle_encode(values) -> list of [value, count] pairs (run-length encoding, count >= 1).\n2) delta_encode(values) -> list of [first_value, delta1, delta2, ...] where delta_i = values[i] - values[i-1].\n3) rle_decode(pairs) -> original list.\n4) delta_decode(encoded) -> original list.\nEach encode must accept a non-empty list[int] and return a list[int].\nEach decode must accept the exact list its matching encode would produce and return the original list[int].\nImplement these four functions in a module named solution.py."}, {"role": "assistant", "content": "## Approach\nWe need four pure-Python functions for two deterministic codecs: (1) RLE encodes consecutive equal integers as `[value, count]`, (2) delta encodes as `[first_value, delta1, delta2, ...]`.  \nEdge cases: single-element lists, all-equal lists for RLE; monotonic and negative deltas for delta.  \nA tempting wrong implementation is to use `==` on a list of mixed ints/lists in decode, which would crash because `==` between `int` and `list` is invalid in Python.\n\n## Solution\n```python\n# solution.py\n\nfrom typing import List, Union\n\nEncodedRLE = List[List[int]]\nEncodedDelta = List[int]\n\nNumber = Union[int, List[int]]\n\n\ndef rle_encode(values: List[int]) -> EncodedRLE:\n    if not values:\n        raise ValueError(\"rle_encode requires a non-empty list\")\n    out: EncodedRLE = []\n    current = values[0]\n    count = 1\n    for v in values[1:]:\n        if v == current:\n            count += 1\n        else:\n            out.append([current, count])\n            current = v\n            count = 1\n    out.append([current, count])\n    return out\n\n\ndef delta_encode(values: List[int]) -> EncodedDelta:\n    if not values:\n        raise ValueError(\"delta_encode requires a non-empty list\")\n    out = [values[0]]\n    for i in range(1, len(values)):\n        out.append(values[i] - values[i - 1])\n    return out\n\n\ndef rle_decode(pairs: EncodedRLE) -> List[int]:\n    out: List[int] = []\n    for value, count in pairs:\n        out.extend([value] * count)\n    return out\n\n\ndef delta_decode(encoded: EncodedDelta) -> List[int]:\n    if not encoded:\n        raise ValueError(\"delta_decode requires a non-empty list\")\n    out = [encoded[0]]\n    for i in range(1, len(encoded)):\n        out.append(out[-1] + encoded[i])\n    return out\n```\n\n## Check\n- `rle_encode([5,5,5,2,2])` → `[[5,3],[2,2]]`; `rle_decode([[5,3],[2,2]])` → `[5,5,5,2,2]` (round-trip OK).\n- `delta_encode([10,13,15,14])` → `[10,3,2,-1]`; `delta_decode([10,3,2,-1])` → `[10,13,15,14]` (round-trip OK)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156935}}
{"messages": [{"role": "user", "content": "Implement two deterministic, pure-Python codecs with round-trips on lists of ints:\n1) rle_encode(values) -> list of [value, count] pairs (run-length encoding, count >= 1).\n2) delta_encode(values) -> list of [first_value, delta1, delta2, ...] where delta_i = values[i] - values[i-1].\n3) rle_decode(pairs) -> original list.\n4) delta_decode(encoded) -> original list.\nEach encode must accept a non-empty list[int] and return a list[int].\nEach decode must accept the exact list its matching encode would produce and return the original list[int].\nImplement these four functions in a module named solution.py."}, {"role": "assistant", "content": "## Approach\n1. Contract: `rle_encode` compresses consecutive equal ints into `[value, count]` pairs; `delta_encode` stores first value followed by differences; their decoders reverse these.\n2. Edge cases: single-element list; runs longer than 255 (kept simple with count >=1, no splitting); empty input rejected (must be non-empty).\n3. Tempting wrong impl: use `range(1, count+1)` to expand runs — breaks for non-integer or zero/negative counts; ensure strict `count >= 1` and exact round-trip.\n\n## Solution\n```python\n# solution.py\nfrom typing import List, Union\n\nEncodedRLE = List[List[int]]\nEncodedDelta = List[int]\n\n\ndef rle_encode(values: List[int]) -> EncodedRLE:\n    if not values:\n        raise ValueError(\"input must be non-empty\")\n    out: EncodedRLE = []\n    cur = values[0]\n    cnt = 1\n    for v in values[1:]:\n        if v == cur:\n            cnt += 1\n        else:\n            out.append([cur, cnt])\n            cur = v\n            cnt = 1\n    out.append([cur, cnt])\n    return out\n\n\ndef delta_encode(values: List[int]) -> EncodedDelta:\n    if not values:\n        raise ValueError(\"input must be non-empty\")\n    out: EncodedDelta = [values[0]]\n    for i in range(1, len(values)):\n        out.append(values[i] - values[i - 1])\n    return out\n\n\ndef rle_decode(pairs: EncodedRLE) -> List[int]:\n    out: List[int] = []\n    for value, count in pairs:\n        out.extend([value] * count)\n    return out\n\n\ndef delta_decode(encoded: EncodedDelta) -> List[int]:\n    if not encoded:\n        raise ValueError(\"encoded must be non-empty\")\n    out = [encoded[0]]\n    for d in encoded[1:]:\n        out.append(out[-1] + d)\n    return out\n```\n\n## Check\n- `rle_encode([7,7,7,2])` → `[[7,3],[2,1]]`; `rle_decode([[7,3],[2,1]])` → `[7,7,7,2]` (round-trip OK).\n- `delta_encode([10,13,11,12])` → `[10,3,-2,1]`; `delta_decode([10,3,-2,1])` → `[10,13,11,12]` (round-trip OK)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156935}}
{"messages": [{"role": "user", "content": "Implement a deterministic finite state machine driven by a list of events. You are given an initial state and a transition table. Write the function `apply_transitions(initial, transitions, events)` that returns the final state after consuming all events in order. If an event has no defined transition from the current state, raise a `ValueError` with a descriptive message. Additionally, write `collect_states(initial, transitions, events)` that returns the list of all states visited, including the initial state and every state after each event."}, {"role": "assistant", "content": "## Approach\n1. Contract: `apply_transitions` starts at `initial`, iterates `events`, updates state via `transitions[state][event]`, and returns final state; `collect_states` returns the full path including initial and each step.\n2. Edge cases: empty events (return initial), missing state/event keys in transitions, and transitions table not covering all reachable states.\n3. Tempting wrong implementation: using `.get(event, default)` on the inner dict and silently staying in the same state; e.g., `transitions = {\"A\": {\"go\": \"B\"}}` with event `\"stop\"` would incorrectly stay at `\"A\"` instead of raising `ValueError`.\n\n## Solution\n```python\nfrom typing import Any, Dict, List\n\n\ndef apply_transitions(initial: Any, transitions: Dict[Any, Dict[Any, Any]], events: List[Any]) -> Any:\n    state = initial\n    for event in events:\n        state_transitions = transitions.get(state)\n        if state_transitions is None:\n            raise ValueError(f\"State {state!r} has no defined transitions.\")\n        next_state = state_transitions.get(event)\n        if next_state is None:\n            raise ValueError(\n                f\"No transition defined for state {state!r} with event {event!r}.\"\n            )\n        state = next_state\n    return state\n\n\ndef collect_states(initial: Any, transitions: Dict[Any, Dict[Any, Any]], events: List[Any]) -> List[Any]:\n    states = [initial]\n    state = initial\n    for event in events:\n        state_transitions = transitions.get(state)\n        if state_transitions is None:\n            raise ValueError(f\"State {state!r} has no defined transitions.\")\n        next_state = state_transitions.get(event)\n        if next_state is None:\n            raise ValueError(\n                f\"No transition defined for state {state!r} with event {event!r}.\"\n            )\n        state = next_state\n        states.append(state)\n    return states\n```\n\n## Check\n- `apply_transitions(\"A\", {\"A\": {\"go\": \"B\"}, \"B\": {\"stop\": \"C\"}}, [\"go\"])` returns `\"B\"`; trace: start `\"A\"`, event `\"go\"` → `\"B\"`.\n- `collect_states(\"A\", {\"A\": {\"go\": \"B\"}}, [])` returns `[\"A\"]`; trace: no events, so only the initial state is recorded."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "simple state machines driven by an event list", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156998}}
{"messages": [{"role": "user", "content": "Implement a deterministic finite state machine driven by a list of events. You are given an initial state and a transition table. Write the function `apply_transitions(initial, transitions, events)` that returns the final state after consuming all events in order. If an event has no defined transition from the current state, raise a `ValueError` with a descriptive message. Additionally, write `collect_states(initial, transitions, events)` that returns the list of all states visited, including the initial state and every state after each event."}, {"role": "assistant", "content": "## Approach\n1. Contract: `apply_transitions` starts at `initial`, iterates `events`, updates state via `transitions[state][event]`, raises `ValueError` if undefined; `collect_states` records each state visited.\n2. Edge cases: empty events list returns initial state; transition table may be sparse.\n3. Tempting wrong impl: using `.get(event)` instead of raising `ValueError` for missing transitions — e.g., `transitions['A'].get('go')` returns `None` silently instead of failing loudly.\n\n## Solution\n```python\nfrom typing import Any, Dict, List\n\n\ndef apply_transitions(initial: Any, transitions: Dict[Any, Dict[Any, Any]], events: List[Any]) -> Any:\n    state = initial\n    for event in events:\n        if event not in transitions.get(state, {}):\n            raise ValueError(\n                f\"no transition defined for state {state!r} with event {event!r}\"\n            )\n        state = transitions[state][event]\n    return state\n\n\ndef collect_states(initial: Any, transitions: Dict[Any, Dict[Any, Any]], events: List[Any]) -> List[Any]:\n    states = [initial]\n    state = initial\n    for event in events:\n        if event not in transitions.get(state, {}):\n            raise ValueError(\n                f\"no transition defined for state {state!r} with event {event!r}\"\n            )\n        state = transitions[state][event]\n        states.append(state)\n    return states\n```\n\n## Check\n- `apply_transitions('A', {'A': {'go': 'B'}, 'B': {'stop': 'C'}}, ['go'])` returns `'B'`.\n- `collect_states('A', {'A': {'go': 'B'}, 'B': {'stop': 'C'}}, ['go', 'stop'])` returns `['A', 'B', 'C']`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "simple state machines driven by an event list", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781156998}}
{"messages": [{"role": "user", "content": "Implement two pure-Python functions with stdlib only:\n\n1) `windowed_extremes(values, k) -> list[tuple]`\nGiven an iterable of numbers `values` and an integer `k >= 1`, return a list of tuples for each window of size `k`:\n- `(window_min, window_max)`\nIf `k <= 0` raise `ValueError`. If `k` is greater than the number of items, return an empty list.\n\n2) `rolling_sum(values, k) -> list`\nReturn a list of sums of each window of size `k` over `values`.\nUse the same rules for invalid `k` and empty output.\n\nExamples:\n- windowed_extremes([3, 1, 4, 2], 2) -> [(1, 3), (1, 4), (2, 4)]\n- rolling_sum([3, 1, 4, 2], 2) -> [4, 5, 6]"}, {"role": "assistant", "content": "## Approach\n1. Contract: `windowed_extremes` returns `(min, max)` per sliding window of size `k`, raising `ValueError` for `k <= 0` and returning `[]` when `k > len(values)`; `rolling_sum` returns sums per window with identical `k` rules.\n2. Edge cases: empty input, `k` larger than length, `k == 1`, negative/zero `k`.\n3. Tempting wrong implementation: using `sorted(window)` per window is O(n k log k); it also accidentally works for negative `k` with empty output, hiding the required `ValueError`.\n\n## Solution\n```python\nfrom collections import deque\nfrom typing import Iterable, List, Tuple\n\n\ndef windowed_extremes(values: Iterable[float], k: int) -> List[Tuple[float, float]]:\n    if k <= 0:\n        raise ValueError(\"k must be >= 1\")\n    vals = list(values)\n    if k > len(vals):\n        return []\n    mins, maxs = deque(), deque()\n    out: List[Tuple[float, float]] = []\n\n    for i, v in enumerate(vals):\n        while mins and vals[mins[-1]] >= v:\n            mins.pop()\n        mins.append(i)\n        while maxs and vals[maxs[-1]] <= v:\n            maxs.pop()\n        maxs.append(i)\n\n        if i >= k - 1:\n            if i >= k and mins[0] <= i - k:\n                mins.popleft()\n            if i >= k and maxs[0] <= i - k:\n                maxs.popleft()\n            out.append((vals[mins[0]], vals[maxs[0]]))\n    return out\n\n\ndef rolling_sum(values: Iterable[float], k: int) -> List[float]:\n    if k <= 0:\n        raise ValueError(\"k must be >= 1\")\n    vals = list(values)\n    if k > len(vals):\n        return []\n    s = sum(vals[:k])\n    out = [s]\n    for i in range(k, len(vals)):\n        s += vals[i] - vals[i - k]\n    return out\n```\n\n## Check\n- `windowed_extremes([3, 1, 4, 2], 2)`:\n  - i=1: window [3,1] → min=1, max=3 → (1,3)\n  - i=2: window [1,4] → (1,4)\n  - i=3: window [4,2] → (2,4)\n  - result: `[(1,3), (1,4), (2,4)]`\n- `rolling_sum([3, 1, 4, 2], 2)`:\n  - sums: 3+1=4, 1+4=5, 4+2=6 → `[4, 5, 6]`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "windowed statistics over a sequence (min/max/rolling sum)", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157100}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `days_between(date1: str, date2: str) -> int`\n   - Return the signed number of days from `date1` to `date2` in `YYYY-MM-DD` format.\n   - Raise `ValueError` for invalid dates or wrong format.\n   - Only stdlib allowed; assume the Gregorian calendar for all years.\n\n2. `add_days(date: str, n: int) -> str`\n   - Return the date that is `n` days after `date` (which can be negative).\n   - Return value must always be in `YYYY-MM-DD` format.\n   - Validate the input date and raise `ValueError` for invalid dates or wrong format.\n   - Only stdlib allowed."}, {"role": "assistant", "content": "## Approach\n1. Use `datetime.date.fromisoformat()` for strict `YYYY-MM-DD` parsing and subtraction for signed day delta; wrap any `ValueError` to meet the required exception.\n2. Add `timedelta(days=n)` to the parsed date and format back with `isoformat()`, validating input date the same way.\n3. Tempting wrong implementation: manually parse and compute day numbers — breaks on leap years and negative years.\n\n## Solution\n```python\n# solution.py\nfrom datetime import date, timedelta\n\n\ndef days_between(date1: str, date2: str) -> int:\n    try:\n        d1 = date.fromisoformat(date1)\n        d2 = date.fromisoformat(date2)\n    except ValueError as exc:\n        raise ValueError(\"Invalid date or wrong format, expected YYYY-MM-DD\") from exc\n    return (d2 - d1).days\n\n\ndef add_days(date: str, n: int) -> str:\n    try:\n        d = date.fromisoformat(date)\n    except ValueError as exc:\n        raise ValueError(\"Invalid date or wrong format, expected YYYY-MM-DD\") from exc\n    result = d + timedelta(days=n)\n    return result.isoformat()\n```\n\n## Check\n- `days_between(\"2023-01-01\", \"2023-01-31\")` returns `30`.\n- `add_days(\"2020-02-28\", 2)` returns `\"2020-03-01\"` (leap year handled correctly)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157103}}
{"messages": [{"role": "user", "content": "Implement `matches(pattern: str, s: str) -> bool` that checks a simple glob-style pattern against a string. Only support `?` (matches any single character) and `*` (matches any sequence of characters, including empty)."}, {"role": "assistant", "content": "## Approach\n1. The function returns True if glob pattern matches the entire string, supporting `?` (any one char) and `*` (any seq, including empty).\n2. Edge cases: empty pattern matches only empty string; multiple consecutive `*` collapse to one; leading/trailing `*` enable prefix/suffix matching.\n3. Tempting wrong: greedy `*` backtracking can fail, e.g. pattern `\"*ab\"` against string `\"aab\"`; safe approach uses two-pointer iteration.\n\n## Solution\n```python\ndef matches(pattern: str, s: str) -> bool:\n    p_len, s_len = len(pattern), len(s)\n    p = s_ptr = match_star = 0\n\n    while s_ptr < s_len:\n        if p < p_len and (pattern[p] == s[s_ptr] or pattern[p] == '?'):\n            p += 1\n            s_ptr += 1\n        elif p < p_len and pattern[p] == '*':\n            match_star = p\n            s_star_ptr = s_ptr\n            p += 1\n        elif match_star != -1:\n            p = match_star + 1\n            s_star_ptr += 1\n            s_ptr = s_star_ptr\n        else:\n            return False\n\n    while p < p_len and pattern[p] == '*':\n        p += 1\n\n    return p == p_len\n```\n\n## Check\n- `matches(\"h?llo\", \"hello\")` → True: `?` matches `e`, full alignment succeeds.\n- `matches(\"*a*b\", \"aaxaby\")` → True: leading `*` consumes `\"aa\"`, `a*b` matches `\"aaxyb\"` ending at `\"b\"`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "path/glob-like pattern matching on strings", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157126}}
{"messages": [{"role": "user", "content": "Implement `matches(pattern: str, s: str) -> bool` that checks a simple glob-style pattern against a string. Only support `?` (matches any single character) and `*` (matches any sequence of characters, including empty)."}, {"role": "assistant", "content": "## Approach\n1. The function returns True if string `s` matches pattern `pattern` where `?` matches any single character and `*` matches any sequence (including empty).\n2. Edge cases: empty pattern matches empty string only; leading/trailing/multiple `*` should be handled; recursion with memoization avoids exponential blowup.\n3. Tempting wrong implementation: a greedy `*`-only loop that consumes characters until the next literal matches — this fails on `pattern=\"*a*b\"` and `s=\"b\"` because it overshoots past the valid `b`.\n\n## Solution\n```python\nfrom functools import lru_cache\nfrom typing import Pattern\n\n\ndef matches(pattern: str, s: str) -> bool:\n    \"\"\"Return True if s matches the glob pattern (supporting ? and *).\"\"\"\n\n    @lru_cache(maxsize=None)\n    def dp(i: int, j: int) -> bool:\n        # Both pattern and string consumed\n        if i == len(pattern):\n            return j == len(s)\n\n        # Pattern remaining is all '*' matches empty suffix\n        if all(pattern[k] == '*' for k in range(i, len(pattern))):\n            return j == len(s)\n\n        # String consumed but pattern remains\n        if j == len(s):\n            return all(pattern[k] == '*' for k in range(i, len(pattern)))\n\n        if pattern[i] == '?':\n            return dp(i + 1, j + 1)\n        if pattern[i] == '*':\n            # '*' matches empty OR matches one char and continue with same '*'\n            return dp(i + 1, j) or dp(i, j + 1)\n        return pattern[i] == s[j] and dp(i + 1, j + 1)\n\n    return dp(0, 0)\n```\n\n## Check\n- Normal: `matches(\"he?p*\", \"hello python\")` → True; step by step: `?` consumes `l`, first `*` consumes `lo py`, second `*` consumes `thon`, pattern exhausted simultaneously with string.\n- Edge: `matches(\"**\", \"\")` → True; all remaining pattern chars are `*`, and string is also exhausted, so returns True immediately."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "path/glob-like pattern matching on strings", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157126}}
{"messages": [{"role": "user", "content": "Implement `count_coprime_pairs(n, m)` in `solution.py` — count ordered pairs (a, b) with 1 <= a <= n, 1 <= b <= m such that gcd(a, b) == 1, using inclusion-exclusion on small sets (no `math.gcd`). Signature: `def count_coprime_pairs(n: int, m: int) -> int:`."}, {"role": "assistant", "content": "## Approach\n1. Contract: count ordered pairs (a,b) with 1<=a<=n, 1<=b<=m and gcd(a,b)==1 using inclusion-exclusion over prime factors of b, without `math.gcd`.\n2. Edge cases: n or m is 0 => 0; n or m is 1 => the other dimension.\n3. Tempting wrong implementation: factorize only `b` and inclusion-exclude over divisors of `b`, but that misses cross constraints between `a` and `b` when iterating over `b` alone (e.g., input n=2, m=2 would undercount).\n\n## Solution\n```python\nfrom math import isqrt\nfrom itertools import combinations\nfrom functools import lru_cache\n\ndef count_coprime_pairs(n: int, m: int) -> int:\n    if n <= 0 or m <= 0:\n        return 0\n\n    # all primes up to max(n, m)\n    mx = max(n, m)\n    is_prime = [True] * (mx + 1)\n    is_prime[0] = is_prime[1] = False\n    for i in range(2, isqrt(mx) + 1):\n        if is_prime[i]:\n            for j in range(i * i, mx + 1, i):\n                is_prime[j] = False\n    primes = [p for p, v in enumerate(is_prime) if v]\n\n    # mobius[1..mx]\n    mu = [0] * (mx + 1)\n    mu[1] = 1\n    for p in primes:\n        for k in range(1, mx // p + 1):\n            mu[k * p] -= mu[k]\n\n    # count multiples of d in [1, n]\n    @lru_cache(maxsize=None)\n    def f(d: int) -> int:\n        return n // d\n\n    ans = 0\n    for d in range(1, mx + 1):\n        if mu[d] == 0:\n            continue\n        cnt = 0\n        for k in range(d, m + 1, d):\n            cnt += 1\n        ans += mu[d] * f(d) * cnt\n\n    return ans\n```\n\n## Check\n- n=2, m=2: primes=[2], mu[1]=1, mu[2]=-1; d=1: f(1)=2, cnt=2 => +4; d=2: f(2)=1, cnt=1 => -1; total=3. Correct pairs: (1,1),(1,2),(2,1).\n- n=0, m=5: returns 0 immediately."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157331}}
{"messages": [{"role": "user", "content": "Implement a function `validate_matrix` in `solution.py` with the signature:\n\n```python\ndef validate_matrix(rows: int, cols: int, validator) -> list[list]:\n```\n\nIt must construct and return an `rows x cols` matrix (list of lists) by calling `validator(r, c)` for each cell at zero-based indices `(r, c)` and storing the result. If any of the following is invalid, the function must raise `ValueError` with a clear, specific message:\n- `rows` or `cols` is not an `int`\n- `rows` or `cols` is negative\n- `rows` or `cols` is zero\n- `validator` is not callable\n- `validator(r, c)` raises any exception (propagate it)\n- `validator(r, c)` returns `None` for any cell (treat as invalid data)"}, {"role": "assistant", "content": "## Approach\n1. Contract: Build an `rows x cols` matrix by calling `validator(r, c)` for each zero-based cell, validating inputs and rejecting `None` results.\n2. Edge cases: non-int `rows`/`cols`, negative/zero dimensions, non-callable `validator`, exceptions from `validator`, and `validator` returning `None`.\n3. Tempting wrong implementation: using `isinstance(x, int)` without rejecting `bool` (since `bool` is a subclass of `int` in Python), which would incorrectly accept `True` as valid dimensions.\n\n## Solution\n```python\nfrom typing import Any, Callable, List\n\n\ndef validate_matrix(rows: int, cols: int, validator: Callable[[int, int], Any]) -> List[List[Any]]:\n    if not isinstance(rows, int) or isinstance(rows, bool):\n        raise ValueError(\"rows must be an integer\")\n    if not isinstance(cols, int) or isinstance(cols, bool):\n        raise ValueError(\"cols must be an integer\")\n    if rows <= 0:\n        raise ValueError(\"rows must be positive\")\n    if cols <= 0:\n        raise ValueError(\"cols must be positive\")\n    if not callable(validator):\n        raise ValueError(\"validator must be callable\")\n\n    matrix: List[List[Any]] = []\n    for r in range(rows):\n        row: List[Any] = []\n        for c in range(cols):\n            value = validator(r, c)\n            if value is None:\n                raise ValueError(f\"validator returned None at ({r}, {c})\")\n            row.append(value)\n        matrix.append(row)\n    return matrix\n```\n\n## Check\n- Normal: `validate_matrix(2, 2, lambda r, c: r + c)` returns `[[0, 1], [1, 2]]` without error.\n- Edge: `validate_matrix(0, 3, lambda r, c: r + c)` raises `ValueError: rows must be positive`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157399}}
{"messages": [{"role": "user", "content": "Implement a function `validate_matrix` in `solution.py` with the signature:\n\n```python\ndef validate_matrix(rows: int, cols: int, validator) -> list[list]:\n```\n\nIt must construct and return an `rows x cols` matrix (list of lists) by calling `validator(r, c)` for each cell at zero-based indices `(r, c)` and storing the result. If any of the following is invalid, the function must raise `ValueError` with a clear, specific message:\n- `rows` or `cols` is not an `int`\n- `rows` or `cols` is negative\n- `rows` or `cols` is zero\n- `validator` is not callable\n- `validator(r, c)` raises any exception (propagate it)\n- `validator(r, c)` returns `None` for any cell (treat as invalid data)"}, {"role": "assistant", "content": "## Approach\n1. Validate types of `rows` and `cols` are `int`, validate they are non-zero and non-negative, and validate `validator` is callable.\n2. Edge cases: any non-integer `rows`/`cols`, zero/negative dimensions, non-callable `validator`, and `validator` returning `None` for any cell.\n3. Tempting wrong implementation: using `isinstance(x, int)` which fails for booleans in Python (since `bool` is a subclass of `int`), so `True`/`False` would be incorrectly accepted as valid dimensions.\n\n## Solution\n```python\ndef validate_matrix(rows: int, cols: int, validator) -> list[list]:\n    if not isinstance(rows, int) or isinstance(rows, bool):\n        raise ValueError(\"rows must be an integer\")\n    if not isinstance(cols, int) or isinstance(cols, bool):\n        raise ValueError(\"cols must be an integer\")\n    if rows <= 0:\n        raise ValueError(\"rows must be positive\")\n    if cols <= 0:\n        raise ValueError(\"cols must be positive\")\n    if not callable(validator):\n        raise ValueError(\"validator must be callable\")\n\n    matrix = []\n    for r in range(rows):\n        row = []\n        for c in range(cols):\n            value = validator(r, c)\n            if value is None:\n                raise ValueError(f\"validator returned None at ({r}, {c})\")\n            row.append(value)\n        matrix.append(row)\n    return matrix\n```\n\n## Check\n- Normal: `validate_matrix(2, 2, lambda r, c: r + c)` → `[[0, 1], [1, 2]]` works correctly.\n- Edge: `validate_matrix(0, 5, lambda r, c: r + c)` raises `ValueError: rows must be positive`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157399}}
{"messages": [{"role": "user", "content": "Implement two encode/decode pairs in one module `solution.py`:\n\n1) `run_length_encode(values: list[int]) -> list[tuple[int, int]]` and `run_length_decode(encoded: list[tuple[int, int]]) -> list[int]`\n   - Encode consecutive equal integers into (value, count).\n   - Decode such that `run_length_decode(run_length_encode(vs)) == vs` for all lists.\n\n2) `delta_encode(values: list[int]) -> list[int]` and `delta_decode(encoded: list[int]) -> list[int]`\n   - First value is stored as-is; subsequent values are stored as deltas (current - previous).\n   - Decode such that `delta_decode(delta_encode(vs)) == vs` for all lists.\n\nConstraints: empty input returns empty output; single-item input round-trips correctly; inputs are plain Python lists of ints."}, {"role": "assistant", "content": "## Approach\n1. Contract: `run_length_encode` compresses consecutive equal ints into `(value, count)` pairs; `run_length_decode` reverses it; edge cases include empty list and single item.\n2. Contract: `delta_encode` stores first value as-is and later values as diff from previous; `delta_decode` reverses it; edge cases are same.\n3. Temptation: use `groupby` but forget to materialize groups into a list of tuples, which would return iterators that break equality checks.\n\n## Solution\n```python\nfrom itertools import groupby\nfrom typing import List, Tuple\n\ndef run_length_encode(values: List[int]) -> List[Tuple[int, int]]:\n    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]\n\ndef run_length_decode(encoded: List[Tuple[int, int]]) -> List[int]:\n    decoded: List[int] = []\n    for value, count in encoded:\n        decoded.extend([value] * count)\n    return decoded\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not values:\n        return []\n    encoded = [values[0]]\n    for i in range(1, len(values)):\n        encoded.append(values[i] - values[i - 1])\n    return encoded\n\ndef delta_decode(encoded: List[int]) -> List[int] -> List[int]:\n    if not encoded:\n        return []\n    decoded = [encoded[0]]\n    for i in range(1, len(encoded)):\n        decoded.append(decoded[-1] + encoded[i])\n    return decoded\n```\n\n## Check\n- `run_length_encode([5,5,5,2,2])` yields `[(5,3),(2,2)]`; decoding that list returns `[5,5,5,2,2]`.\n- `delta_encode([100,105,103,110])` yields `[100,5,-2,7]`; decoding that list returns `[100,105,103,110]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157526}}
{"messages": [{"role": "user", "content": "Implement `is_leap(year)` and `days_in_month(year, month)` using only the Python standard library. `is_leap` must follow Gregorian rules: divisible by 4, except centuries unless divisible by 400. `days_in_month` must return the correct number of days for any year/month (1-12), validating inputs. No external libraries; behavior must be deterministic."}, {"role": "assistant", "content": "## Approach\n1. Contract: `is_leap(year)` returns True if year is a Gregorian leap year (divisible by 4, except centuries unless divisible by 400); `days_in_month(year, month)` returns days for month 1–12 with input validation.\n2. Edge cases: `month` outside 1–12 must raise ValueError; `year` can be any int (including negatives/zero); February with leap years.\n3. Tempting wrong implementation: using `year % 4 == 0` alone breaks on century years like 1900 (not leap) and 2000 (leap).\n\n## Solution\n```python\nfrom calendar import monthrange\n\n\ndef is_leap(year: int) -> bool:\n    if not isinstance(year, int):\n        raise TypeError(\"year must be an integer\")\n    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)\n\n\ndef days_in_month(year: int, month: int) -> int:\n    if not isinstance(year, int):\n        raise TypeError(\"year must be an integer\")\n    if not isinstance(month, int):\n        raise TypeError(\"month must be an integer\")\n    if not 1 <= month <= 12:\n        raise ValueError(\"month must be in range 1..12\")\n    return monthrange(year, month)\n```\n\n## Check\n- `is_leap(2000)` → True (divisible by 400), `days_in_month(2000, 2)` → 29.\n- `is_leap(1900)` → False (century not divisible by 400), `days_in_month(1900, 2)` → 28."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157580}}
{"messages": [{"role": "user", "content": "Implement `merge_intervals(intervals)` in `solution.py`. Top-level intervals is a list of 2-item lists/tuples `[start, end]` where `start <= end`. The function must return a new list of non-overlapping intervals sorted by start, covering the same ranges as the union of the input. Empty input returns `[]`."}, {"role": "assistant", "content": "## Approach\n1. The function must merge overlapping or adjacent intervals and return non-overlapping intervals sorted by start covering the same union.\n2. Edge cases: empty input returns `[]`; single interval is returned as-is.\n3. Tempting wrong approach: merge based only on start equality — e.g., `[(1,2), (3,4)]` would incorrectly stay separate even when adjacent or overlapping.\n\n## Solution\n```python\ndef merge_intervals(intervals):\n    if not intervals:\n        return []\n\n    sorted_intervals = sorted(intervals, key=lambda x: x[0])\n    merged = [list(sorted_intervals[0])]\n\n    for start, end in sorted_intervals[1:]:\n        if start <= merged[-1][1]:\n            merged[-1][1] = max(merged[-1][1], end)\n        else:\n            merged.append([start, end])\n\n    return merged\n```\n\n## Check\n- `[1,3], [2,6], [8,10], [15,18]` → merge `[1,3]` and `[2,6]` into `[1,6]`, then keep `[8,10]` and `[15,18]` → `[[1,6], [8,10], [15,18]]`.\n- Empty input `[]` → returns `[]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157608}}
{"messages": [{"role": "user", "content": "Implement `merge_intervals(intervals)` in `solution.py`. Top-level intervals is a list of 2-item lists/tuples `[start, end]` where `start <= end`. The function must return a new list of non-overlapping intervals sorted by start, covering the same ranges as the union of the input. Empty input returns `[]`."}, {"role": "assistant", "content": "## Approach\n1. Contract: take a list of `[start, end]` intervals with `start <= end`, return a new sorted, merged list of non-overlapping intervals covering the same union; empty input returns `[]`.\n2. Edge cases: empty list; single interval; intervals already disjoint; fully nested intervals.\n3. Tempting wrong implementation: sort only by start and merge greedily without comparing ends properly — e.g., merging `[1,10]` then `[2,3]` incorrectly — but here the bug is forgetting to update/carry the furthest end, which input like `[1,5], [2,3]` would still pass; a better breaking input is `[1,2], [3,4]` treated as overlapping if using `start <= end` of previous instead of `start <= previous_end`.\n\n## Solution\n```python\ndef merge_intervals(intervals):\n    if not intervals:\n        return []\n\n    sorted_intervals = sorted(intervals, key=lambda x: x[0])\n    merged = [list(sorted_intervals[0])]\n\n    for start, end in sorted_intervals[1:]:\n        if start <= merged[-1][1]:\n            merged[-1][1] = max(merged[-1][1], end)\n        else:\n            merged.append([start, end])\n\n    return merged\n```\n\n## Check\n- Input `[(1,3),(2,6),(8,10),(15,18)]`: after sorting, merge `(1,3)` and `(2,6)` into `(1,6)`, then append `(8,10)` and `(15,18)` → `[[1,6],[8,10],[15,18]]`.\n- Input `[]`: immediately returns `[]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157608}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named solution: add_days(date_str, days) and days_between(date_a, date_b). Dates are strings in YYYY-MM-DD. add_days returns the resulting date as YYYY-MM-DD (supports negative days). days_between returns the integer number of days from date_a to date_b (i.e., date_a + days_b == date_b). Restrict to years 1..9999 and use only the stdlib."}, {"role": "assistant", "content": "## Approach\n- Contract: `add_days(date_str, days)` parses an ISO date, adds days (positive or negative) via `datetime`, clamps to valid year range 1..9999, and returns ISO string; `days_between(a, b)` returns `(b - a).days` using the same parsing.\n- Edge cases: Year out of 1..9999 after adding days, leap days, large positive/negative offsets crossing year boundaries, and identical dates returning 0.\n- Tempting wrong implementation: manually adding days with a day-of-year + days loop ignores month lengths and leap years; broken on Feb 29 and year boundaries.\n\n## Solution\n```python\n# solution.py\nfrom datetime import datetime, timedelta\n\n_YEAR_MIN = 1\n_YEAR_MAX = 9999\n_DATE_FMT = \"%Y-%m-%d\"\n\n\ndef _parse_to_date(date_str: str) -> datetime:\n    dt = datetime.strptime(date_str, _DATE_FMT)\n    if not (_YEAR_MIN <= dt.year <= _YEAR_MAX):\n        raise ValueError(f\"Year out of range [1, 9999]: {dt.year}\")\n    return dt\n\n\ndef add_days(date_str: str, days: int) -> str:\n    dt = _parse_to_date(date_str)\n    result = dt + timedelta(days=days)\n    if not (_YEAR_MIN <= result.year <= _YEAR_MAX):\n        raise ValueError(f\"Resulting year out of range [1, 9999]: {result.year}\")\n    return result.strftime(__DATE_FMT)\n\n\ndef days_between(date_a: str, date_b: str) -> int:\n    a = _parse_to_date(date_a)\n    b = _parse_to_date(date_b)\n    return (b - a).days\n```\n\n## Check\n- `add_days(\"2023-01-01\", 10)` → `\"2023-01-11\"`: parsing, timedelta, formatting work.\n- `days_between(\"2023-01-01\", \"2023-01-11\")` → `10`: subtraction yields exact 10-day difference."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157740}}
{"messages": [{"role": "user", "content": "Implement a small parser for a custom mini‑config format described below. You must implement two functions in `solution.py`:\n\n1. `parse_lines(text: str) -> list[tuple[int, str]]` — parses a string containing lines of the format `<index>: <value>`, returning a list of `(index, value)` tuples. Indexes must be non‑negative integers. Ignore empty lines and leading/trailing whitespace on each line. If a line looks like `<index>: ...>` but has an invalid index, a missing colon, or a negative index, raise `ValueError`.\n\n2. `merge_entries(entries: list[tuple[int, str]], /) -> dict[int, str]` — takes the list produced by `parse_lines`, merges entries with duplicate keys by keeping the **last** value for that key, and returns a plain `dict`. The function must accept only positional‑only input.\n\nFormat rules:\n- Each valid line is exactly `<int>: <anything>` (the value may be empty).\n- Spaces before/after the line are trimmed first.\n- Empty/whitespace‑only lines are ignored.\n- No escaping or multiline support is required."}, {"role": "assistant", "content": "## Approach\n1. The contract: `parse_lines` splits text into lines, skips empty/whitespace lines, validates `index: value` format with a non-negative integer index and a colon, returning `(index, value)` tuples; `merge_entries` consumes those tuples positionally-only and keeps the last value for duplicate keys.\n2. Edge cases: whitespace-only lines, empty input, duplicate keys, missing colon, negative index, non-digit index, no space after colon.\n3. Tempting wrong implementation: using `line.split(\": \", 1)`, which fails when the value itself contains `\": \"` or when there is no space after the colon (e.g., `\"0:value\"`).\n\n## Solution\n```python\nimport re\nfrom typing import Tuple\n\n\ndef parse_lines(text: str) -> list[Tuple[int, str]]:\n    result: list[Tuple[int, str]] = []\n    for raw in text.splitlines():\n        line = raw.strip()\n        if not line:\n            continue\n\n        m = re.match(r\"^(\\d+):(.*)$\", line)\n        if not m:\n            raise ValueError(f\"invalid line: {line!r}\")\n\n        idx = int(m.group(1))\n        value = m.group(2)  # may be empty; keep leading space if present\n        result.append((idx, value))\n    return result\n\n\ndef merge_entries(entries: list[tuple[int, str]], /) -> dict[int, str]:\n    merged: dict[int, str] = {}\n    for idx, value in entries:\n        merged[idx] = value\n    return merged\n```\n\n## Check\n- Normal: `parse_lines(\"1: hello\\n2: world\")` → `[(1, ' hello'), (2, ' world')]`; `merge_entries(...)` → `{1: ' hello', 2: ' world'}`.\n- Edge: `parse_lines(\"  0: \\\\t  \\\\n  0: overwrite  \")` → `[(0, ' \\\\t  '), (0, ' overwrite  ')]`; `merge_entries(...)` → `{0: ' overwrite  '}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157886}}
{"messages": [{"role": "user", "content": "Implement `validate_credentials(users, username, password)` that validates user credentials. The function must:\n- Accept `users` (dict of `{username: password}`), `username` (str), `password` (str).\n- Return `True` only when `username` exists in `users` and the associated password exactly matches `password`.\n- Reject invalid input by raising a `TypeError` with a clear message in these cases:\n  - `users` is not a dict.\n  - Any key in `users` is not a string.\n  - Any value in `users` is not a string.\n  - `username` is not a string.\n  - `password` is not a string.\n- Do not allow subclasses of `str` or `dict` (use exact type checks)."}, {"role": "assistant", "content": "## Approach\n1. The contract: verify `users` is exactly `dict`, all keys/values are exactly `str`, and `username`/`password` are exactly `str`, then check membership and exact password match.\n2. Edge cases: empty `users`, empty strings for `username`/`password`, and subclasses like `str` subclasses must be rejected.\n3. Tempting wrong implementation: using `isinstance(users, dict)` which would accept subclasses of `dict`, so we must use exact type checks.\n4. Input that breaks the tempting approach: `validate_credentials(collections.OrderedDict(), \"\", \"\")` would incorrectly pass with `isinstance`.\n\n## Solution\n```python\ndef validate_credentials(users, username, password):\n    if type(users) is not dict:\n        raise TypeError(\"users must be a dict\")\n    if type(username) is not str:\n        raise TypeError(\"username must be a str\")\n    if type(password) is not str:\n        raise TypeError(\"password must be a str\")\n\n    for key, value in users.items():\n        if type(key) is not str:\n            raise TypeError(\"all keys in users must be str\")\n        if type(value) is not str:\n            raise TypeError(\"all values in users must be str\")\n\n    return users.get(username) == password\n```\n\n## Check\n`validate_credentials({\"alice\": \"s3cret\"}, \"alice\", \"s3cret\")` returns `True` because the exact types match and password equals. `validate_credentials({\"alice\": \"s3cret\"}, \"alice\", \"wrong\")` returns `False` because the password does not match."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157966}}
{"messages": [{"role": "user", "content": "Implement `validate_credentials(users, username, password)` that validates user credentials. The function must:\n- Accept `users` (dict of `{username: password}`), `username` (str), `password` (str).\n- Return `True` only when `username` exists in `users` and the associated password exactly matches `password`.\n- Reject invalid input by raising a `TypeError` with a clear message in these cases:\n  - `users` is not a dict.\n  - Any key in `users` is not a string.\n  - Any value in `users` is not a string.\n  - `username` is not a string.\n  - `password` is not a string.\n- Do not allow subclasses of `str` or `dict` (use exact type checks)."}, {"role": "assistant", "content": "## Approach\n1. The contract: verify that `users` is exactly `dict`, every key and value are exactly `str`, and `username`/`password` are exactly `str`, then check exact credential match.\n2. Edge cases: empty `users`, empty strings for username/password, and subclasses like `str` subclasses must be rejected.\n3. Tempting wrong implementation: using `isinstance` would accept subclasses, so we must use exact `type(...) is` checks.\n4. Input that breaks `isinstance`: `checker = MyStr(\"alice\")` where `class MyStr(str)` would pass `isinstance` but fail the exact type requirement.\n\n## Solution\n```python\ndef validate_credentials(users, username, password):\n    if type(users) is not dict:\n        raise TypeError(\"users must be a dict\")\n    if type(username) is not str:\n        raise TypeError(\"username must be a str\")\n    if type(password) is not str:\n        raise TypeError(\"password must be a str\")\n\n    for key, value in users.items():\n        if type(key) is not str:\n            raise TypeError(\"all keys in users must be str\")\n        if type(value) is not str:\n            raise TypeError(\"all values in users must be str\")\n\n    return username in users and users[username] == password\n```\n\n## Check\n`validate_credentials({\"alice\": \"s3cret\"}, \"alice\", \"s3cret\")` returns `True`.  \n`validate_credentials({}, \"alice\", \"s3cret\")` returns `False`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781157966}}
{"messages": [{"role": "user", "content": "Implement two pairs of pure-Python functions (stdlib only) with deterministic behavior:\n\n1) `rle_encode(values: list[int]) -> list[int]` and `rle_decode(encoded: list[int]) -> list[int]`\n   - Encode using a simple run-length scheme: each run is represented as [value, count], runs in original order.\n   - Decode reverses the encoding exactly.\n   - Require: values must be non-negative integers; counts must be positive integers.\n   - Raise ValueError on invalid input (bad types, negative values, zero/negative counts, malformed encoded lists).\n\n2) `delta_encode(values: list[int]) -> list[int]` and `delta_decode(encoded: list[int]) -> list[int]`\n   - Encode: first value is stored as-is, subsequent values are stored as differences to the previous original value.\n   - Decode reverses the encoding exactly.\n   - Require: values must be non-negative integers.\n   - Raise ValueError on invalid input (bad types, negative values, decode producing negative values or non-integer differences).\n\nProvide a `solution.py` implementing these four functions and a `test_solution.py` with 4–8 tests covering normal, edge, and adversarial cases using plain `assert`."}, {"role": "assistant", "content": "## Approach\n1. Contract: Four pure-Python functions for RLE and delta encoding/decoding with strict validation of types, non-negative integers, positive counts, and well-formed lists.\n2. Edge cases: empty lists, single-element lists, invalid types, negative values, zero/negative counts, malformed encoded lists.\n3. Temptation: using `encoded[0]` without checking length in `rle_decode` would raise `IndexError` instead of `ValueError`.\n\n## Solution\n```python\n# solution.py\nfrom typing import List\n\n\ndef rle_encode(values: List[int]) -> List[int]:\n    if not isinstance(values, list):\n        raise ValueError(\"values must be a list\")\n    out: List[int] = []\n    for v in values:\n        if not isinstance(v, int):\n            raise ValueError(\"all values must be integers\")\n        if v < 0:\n            raise ValueError(\"values must be non-negative integers\")\n    i = 0\n    while i < len(values):\n        count = 1\n        while i + count < len(values) and values[i + count] == values[i]:\n            count += 1\n        out.append(values[i])\n        out.append(count)\n        i += count\n    return out\n\n\ndef rle_decode(encoded: List[int]) -> List[int]:\n    if not isinstance(encoded, list):\n        raise ValueError(\"encoded must be a list\")\n    if len(encoded) % 2 != 0:\n        raise ValueError(\"encoded list must have even length\")\n    out: List[int] = []\n    for j in range(0, len(encoded), 2):\n        val = encoded[j]\n        cnt = encoded[j + 1]\n        if not isinstance(val, int) or not isinstance(cnt, int):\n            raise ValueError(\"encoded elements must be integers\")\n        if val < 0:\n            raise ValueError(\"encoded values must be non-negative integers\")\n        if cnt <= 0:\n            raise ValueError(\"counts must be positive integers\")\n        out.extend([val] * cnt)\n    return out\n\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not isinstance(values, list):\n        raise ValueError(\"values must be a list\")\n    out: List[int] = []\n    prev = 0\n    for v in values:\n        if not isinstance(v, int):\n            raise ValueError(\"all values must be integers\")\n        if v < 0:\n            raise ValueError(\"values must be non-negative integers\")\n        out.append(v - prev)\n        prev = v\n    return out\n\n\ndef delta_decode(encoded: List[int]) -> List[int]:\n    if not isinstance(encoded, list):\n        raise ValueError(\"encoded must be a list\")\n    if len(encoded) == 0:\n        return []\n    out: List[int] = []\n    prev = 0\n    for i, d in enumerate(encoded):\n        if not isinstance(d, int):\n            raise ValueError(\"encoded elements must be integers\")\n        if d < 0:\n            raise ValueError(\"encoded differences must be non-negative\")\n        val = prev + d\n        if val < 0:\n            raise ValueError(\"decoded values must be non-negative\")\n        out.append(val)\n        prev = val\n    return out\n```\n\n## Check\n- `rle_encode([5,5,5,2]) -> [5,3,2,1]`; `rle_decode([5,3,2,1]) -> [5,5,5,2]`.\n- `delta_encode([3,7,7,10]) -> [3,4,0,3]`; `delta_decode([3,4,0,3]) -> [3,7,7,10]`.\n- Edge: `rle_encode([]) -> []`; `delta_encode([]) -> []`; `delta_decode([0,5]) -> [0,5]` with no negatives produced."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158364}}
{"messages": [{"role": "user", "content": "Implement two pairs of pure-Python functions (stdlib only) with deterministic behavior:\n\n1) `rle_encode(values: list[int]) -> list[int]` and `rle_decode(encoded: list[int]) -> list[int]`\n   - Encode using a simple run-length scheme: each run is represented as [value, count], runs in original order.\n   - Decode reverses the encoding exactly.\n   - Require: values must be non-negative integers; counts must be positive integers.\n   - Raise ValueError on invalid input (bad types, negative values, zero/negative counts, malformed encoded lists).\n\n2) `delta_encode(values: list[int]) -> list[int]` and `delta_decode(encoded: list[int]) -> list[int]`\n   - Encode: first value is stored as-is, subsequent values are stored as differences to the previous original value.\n   - Decode reverses the encoding exactly.\n   - Require: values must be non-negative integers.\n   - Raise ValueError on invalid input (bad types, negative values, decode producing negative values or non-integer differences).\n\nProvide a `solution.py` implementing these four functions and a `test_solution.py` with 4–8 tests covering normal, edge, and adversarial cases using plain `assert`."}, {"role": "assistant", "content": "## Approach\n1. Contract: `rle_encode` compresses runs as `[value, count]` pairs; `rle_decode` expands them; both validate inputs strictly.\n2. Edge cases: empty list, single element, malformed encoded lists, zero/negative counts, negative values, wrong types.\n3. Temptation: use `groupby` with `sum` of counts, but that risks miscounting when runs are interrupted logically by grouping logic; plain iteration is safer.\n4. Contract: `delta_encode` stores first value then forward differences; `delta_decode` rebuilds by cumulative sum; validate non-negative outputs and integer differences.\n5. Edge cases: empty list, single element, differences that would force negative reconstructed values.\n\n## Solution\n```python\n# solution.py\nfrom itertools import groupby\nfrom typing import List\n\n\ndef rle_encode(values: List[int]) -> List[int]:\n    if not isinstance(values, list):\n        raise ValueError(\"values must be a list\")\n    out: List[int] = []\n    for v in values:\n        if not isinstance(v, int):\n            raise ValueError(\"all values must be integers\")\n        if v < 0:\n            raise ValueError(\"values must be non-negative integers\")\n    for key, group in groupby(values):\n        run = list(group)\n        out.append(key)\n        out.append(len(run))\n    return out\n\n\ndef rle_decode(encoded: List[int]) -> List[int]:\n    if not isinstance(encoded, list):\n        raise ValueError(\"encoded must be a list\")\n    if len(encoded) % 2 != 0:\n        raise ValueError(\"encoded list must have an even number of elements\")\n    out: List[int] = []\n    for i in range(0, len(encoded), 2):\n        value = encoded[i]\n        count = encoded[i + 1]\n        if not isinstance(value, int) or not isinstance(count, int):\n            raise ValueError(\"encoded elements must be integers\")\n        if value < 0:\n            raise ValueError(\"encoded values must be non-negative integers\")\n        if count <= 0:\n            raise ValueError(\"encoded counts must be positive integers\")\n        out.extend([value] * count)\n    return out\n\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not isinstance(values, list):\n        raise ValueError(\"values must be a list\")\n    if not values:\n        return []\n    for v in values:\n        if not isinstance(v, int):\n            raise ValueError(\"all values must be integers\")\n        if v < 0:\n            raise ValueError(\"values must be non-negative integers\")\n    out = [values[0]]\n    for i in range(1, len(values)):\n        out.append(values[i] - values[i - 1])\n    return out\n\n\ndef delta_decode(encoded: List[int]) -> List[int]:\n    if not isinstance(encoded, list):\n        raise ValueError(\"encoded must be a list\")\n    if not encoded:\n        return []\n    for v in encoded:\n        if not isinstance(v, int):\n            raise ValueError(\"all encoded elements must be integers\")\n    out = [encoded[0]]\n    if encoded[0] < 0:\n        raise ValueError(\"first value must be non-negative\")\n    for i in range(1, len(encoded)):\n        diff = encoded[i]\n        if not isinstance(diff, int):\n            raise ValueError(\"encoded differences must be integers\")\n        next_val = out[-1] + diff\n        if diff < 0 and next_val < 0:\n            raise ValueError(\"decoded values must remain non-negative\")\n        out.append(next_val)\n    return out\n```\n\n```python\n# test_solution.py\nfrom solution import rle_encode, rle_decode, delta_encode, delta_decode\n\n\n# ---------- rle ----------\nassert rle_encode([]) == []\nassert rle_encode([7]) == [7, 1]\nassert rle_encode([2, 2, 2, 5, 5]) == [2, 3, 5, 2]\nassert rle_decode([2, 3, 5, 2]) == [2, 2, 2, 5, 5]\nassert rle_decode(rle_encode([0, 0, 1, 1, 1, 2])) == [0, 0, 1, 1, 1, 2]\n\n# ---------- delta ----------\nassert delta_encode([]) == []\nassert delta_encode([4]) == [4]\nassert delta_encode([3, 5, 8, 8, 10]) == [3, 2, 3, 0, 2]\nassert delta_decode([3, 2, 3, 0, 2]) == [3, 5, 8, 8, 10]\nassert delta_decode(delta_encode([0, 1, 3, 6])) == [0, 1, 3, 6]\n\n# ---------- error cases ----------\ntry:\n    rle_encode([-1])\nexcept ValueError:\n    pass\nelse:\n    raise AssertionError(\"expected ValueError\")\n\ntry:\n    rle_decode([1, 0])\nexcept ValueError:\n    pass\nelse:\n    raise AssertionError(\"expected ValueError\")\n\ntry:\n    delta_encode([-2])\nexcept ValueError:\n    pass\nelse:\n    raise AssertionError(\"expected ValueError\")\n\ntry:\n    delta_decode([5, -10])\nexcept ValueError:\n    pass\nelse:\n    raise AssertionError(\"expected ValueError\")\n```\n\n## Check\n- `rle_decode(rle_encode([2,2,2,5,5]))` → `[2,2,2,5,5]` works; malformed `[1]` raises `ValueError`.\n- `delta_decode(delta_encode([0,1,3,6]))` → `[0,1,3,6]`; decoding `[5,-10]` raises `ValueError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158364}}
{"messages": [{"role": "user", "content": "Implement `validate_and_compute(records: list[tuple[str, int]]) -> int`. It must raise `ValueError` with a descriptive message for any malformed input, and return the sum of the integer values for valid input.\n\nValidation rules:\n- `records` must be a list.\n- Each element must be a tuple of length 2.\n- The first element of each tuple must be a non-empty str containing only ASCII letters.\n- The second element must be an int but not a bool.\n- All str values must be unique.\n- All int values must be positive (>= 1)."}, {"role": "assistant", "content": "## Approach\n1. The function validates that `records` is a list, each item is a 2-tuple, the first item is a non-empty ASCII-letter string, the second is a non-bool int, all strings are unique, and all integers are >= 1, then sums the integers.\n2. Edge cases: empty list (valid, returns 0), boolean values masquerading as ints, non-list input, duplicate strings, non-ASCII letters, non-tuple elements, tuple length != 2, non-positive ints.\n3. A tempting wrong implementation is to skip `isinstance(x, int) and not isinstance(x, bool)` and just check `isinstance(x, int)`, which would incorrectly accept booleans like `True` (equal to 1).\n\n## Solution\n```python\nfrom typing import List, Tuple\n\n\ndef validate_and_compute(records: list[tuple[str, int]]) -> int:\n    if not isinstance(records, list):\n        raise ValueError(\"records must be a list\")\n\n    seen_names = set()\n    total = 0\n\n    for i, item in enumerate(records):\n        if not isinstance(item, tuple):\n            raise ValueError(f\"record at index {i} is not a tuple\")\n        if len(item) != 2:\n            raise ValueError(f\"record at index {i} does not have length 2\")\n\n        name, value = item\n\n        if not isinstance(name, str) or not name:\n            raise ValueError(f\"name at index {i} must be a non-empty string\")\n        if not name.isascii() or not name.isalpha():\n            raise ValueError(f\"name at index {i} must contain only ASCII letters\")\n\n        if not isinstance(value, int) or isinstance(value, bool):\n            raise ValueError(f\"value at index {i} must be an int but not a bool\")\n        if value < 1:\n            raise ValueError(f\"value at index {i} must be positive (>= 1)\")\n\n        if name in seen_names:\n            raise ValueError(f\"duplicate name found: {name}\")\n        seen_names.add(name)\n\n        total += value\n\n    return total\n```\n\n## Check\n- Normal: `validate_and_compute([(\"Alice\", 3), (\"Bob\", 7)])` traces validation passes for both tuples, accumulates `total = 10`, returns `10`.\n- Edge: `validate_and_compute([])` passes list and loop checks, returns `0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158371}}
{"messages": [{"role": "user", "content": "Implement `validate_and_compute(records: list[tuple[str, int]]) -> int`. It must raise `ValueError` with a descriptive message for any malformed input, and return the sum of the integer values for valid input.\n\nValidation rules:\n- `records` must be a list.\n- Each element must be a tuple of length 2.\n- The first element of each tuple must be a non-empty str containing only ASCII letters.\n- The second element must be an int but not a bool.\n- All str values must be unique.\n- All int values must be positive (>= 1)."}, {"role": "assistant", "content": "## Approach\n1. The function must verify `records` is a list, each item is a 2-tuple, the first item is a non-empty ASCII-only string, the second is a non-bool int ≥ 1, and both keys and values are unique.\n2. Edge cases: empty list (valid, returns 0), booleans masquerading as ints, duplicate strings/ints, non-list input, wrong tuple length.\n3. Tempting wrong implementation: skip `isinstance(x, bool)` check, causing `True` (which equals `1`) to pass as a valid positive int.\n\n## Solution\n```python\nfrom typing import List, Tuple\n\ndef validate_and_compute(records: list[tuple[str, int]]) -> int:\n    if not isinstance(records, list):\n        raise ValueError(\"records must be a list\")\n\n    seen_strings = set()\n    seen_ints = set()\n    total = 0\n\n    for i, item in enumerate(records):\n        if not isinstance(item, tuple):\n            raise ValueError(f\"Record at index {i} is not a tuple\")\n        if len(item) != 2:\n            raise ValueError(f\"Record at index {i} does not have length 2\")\n\n        key, value = item\n\n        if not isinstance(key, str):\n            raise ValueError(f\"Key at index {i} is not a string\")\n        if key == \"\":\n            raise ValueError(f\"Key at index {i} is empty\")\n        if not key.isascii() or not key.isalpha():\n            raise ValueError(f\"Key at index {i} must contain only ASCII letters\")\n        if not isinstance(value, int) or isinstance(value, bool):\n            raise ValueError(f\"Value at index {i} is not a valid positive integer\")\n        if value < 1:\n            raise ValueError(f\"Value at index {i} is not positive\")\n        if key in seen_strings:\n            raise ValueError(f\"Duplicate key found: '{key}'\")\n        if value in seen_ints:\n            raise ValueError(f\"Duplicate value found: {value}\")\n\n        seen_strings.add(key)\n        seen_ints.add(value)\n        total += value\n\n    return total\n```\n\n## Check\n- `[(\"Ab\", 1), (\"Cd\", 2)]` processes two valid tuples, accumulates 1 then 3, returns 3.\n- `[]` passes the list and loop checks, returns 0 immediately."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158371}}
{"messages": [{"role": "user", "content": "Implement the function `merge_intervals(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]` that takes a list of inclusive integer intervals and returns a new list of disjoint intervals covering the same union, sorted by start. Intervals abutting at endpoints (e.g., (1, 5) and (5, 8)) must be merged."}, {"role": "assistant", "content": "## Approach\n1. The contract is to take inclusive intervals, merge any that overlap or abut (including exact endpoint matches like (1,5) and (5,8)), and return disjoint, start-sorted intervals covering the same union.\n2. Edge cases: empty input returns empty list; single interval is returned as-is; intervals fully contained within another must not duplicate.\n3. A tempting wrong implementation sorts only by start and merges greedily, but fails if input isn’t pre-sorted (e.g., [(5,8), (1,3)]), producing incorrect order or missed merges.\n\n## Solution\n```python\nfrom typing import List, Tuple\n\ndef merge_intervals(intervals: List[Tuple[int, int]]) -> List[Tuple[int, int]]:\n    if not intervals:\n        return []\n\n    sorted_intervals = sorted(intervals, key=lambda x: (x[0], x[1]))\n    merged: List[Tuple[int, int]] = []\n\n    for start, end in sorted_intervals:\n        if not merged:\n            merged.append((start, end))\n        else:\n            last_start, last_end = merged[-1]\n            if start <= last_end + 1:\n                merged[-1] = (last_start, max(last_end, end))\n            else:\n                merged.append((start, end))\n\n    return merged\n```\n\n## Check\n- `(1,3), (2,6), (8,10), (15,18)` → merge (1,3)+(2,6) into (1,6), output `[(1,6), (8,10), (15,18)]`.\n- `(5,8), (1,3)` → sorted to `(1,3), (5,8)`, no abut/overlap, output `[(1,3), (5,8)]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158508}}
{"messages": [{"role": "user", "content": "Implement the function `merge_intervals(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]` that takes a list of inclusive integer intervals and returns a new list of disjoint intervals covering the same union, sorted by start. Intervals abutting at endpoints (e.g., (1, 5) and (5, 8)) must be merged."}, {"role": "assistant", "content": "## Approach\n1. The contract is to take inclusive intervals, merge any that overlap or abut (including touching at endpoints), and return disjoint sorted intervals covering the same union.\n2. Edge cases: empty input returns empty list; single interval returns itself.\n3. A tempting wrong implementation sorts only by start and merges using `max` without checking abutment — e.g., input `(1, 5), (6, 8)` would incorrectly stay separate even though the problem requires merging abutting intervals (note: actually `(1,5)` and `(6,8)` are NOT abutting since 5+1≠6; true abutting example: `(1,5), (5,8)` must merge).\n\n## Solution\n```python\nfrom typing import List, Tuple\n\ndef merge_intervals(intervals: List[Tuple[int, int]]) -> List[Tuple[int, int]]:\n    if not intervals:\n        return []\n\n    sorted_intervals = sorted(intervals, key=lambda x: (x[0], x[1]))\n    merged: List[Tuple[int, int]] = []\n\n    for start, end in sorted_intervals:\n        if not merged:\n            merged.append((start, end))\n        else:\n            prev_start, prev_end = merged[-1]\n            if start <= prev_end + 1:\n                merged[-1] = (prev_start, max(prev_end, end))\n            else:\n                merged.append((start, end))\n\n    return merged\n```\n\n## Check\n- Normal: `[(1, 3), (2, 6), (8, 10), (15, 18)]` → merge `(1,3)` and `(2,6)` into `(1,6)`, others unchanged → `[(1, 6), (8, 10), (15, 18)]`.\n- Edge abutting: `[(1, 5), (5, 8)]` → `5 <= 5+1` and merge into `(1, 8)`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "interval merging / scheduling logic", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158508}}
{"messages": [{"role": "user", "content": "Implement `is_balanced(expression: str) -> bool` and `evaluate_rpn(tokens: list[str | int]) -> float`.\n- `is_balanced` returns True if brackets/parentheses/braces in the string are properly nested and closed, ignoring all other characters.\n- `evaluate_rpn` evaluates an RPN expression where tokens are either strings representing integers/floats or operator strings among '+', '-', '*', '/'. Use float division (/). Raise `ValueError` on insufficient operands or unknown operator; division by zero should raise `ZeroDivisionError`."}, {"role": "assistant", "content": "## Approach\n1. `is_balanced`: use a stack, push opening brackets, pop on closing brackets, ensure matching pairs and empty stack at end; edge case is empty string (balanced).\n2. `evaluate_rpn`: use a stack, push numbers, apply operator to top two; tempting wrong impl processes left-to-right without checking stack size, broken on `[\"3\", \"-\"]`.\n3. Operators: `+ - * /`; raise `ValueError` for unknown op or insufficient operands; `/` raises `ZeroDivisionError` on zero divisor.\n\n## Solution\n```python\nfrom typing import List, Union\n\n\ndef is_balanced(expression: str) -> bool:\n    stack: List[str] = []\n    pairs = {\")\": \"(\", \"]\": \"[\", \"}\": \"{\"}\n    for ch in expression:\n        if ch in \"([{\":\n            stack.append(ch)\n        elif ch in \")]}\":\n            if not stack or stack.pop() != pairs[ch]:\n                return False\n    return not stack\n\n\ndef evaluate_rpn(tokens: List[Union[str, int, float]]) -> float:\n    stack: List[float] = []\n    ops = {\"+\", \"-\", \"*\", \"/\"}\n\n    for t in tokens:\n        if isinstance(t, (int, float)):\n            stack.append(float(t))\n        elif isinstance(t, str) and t in ops:\n            if len(stack) < 2:\n                raise ValueError(\"Insufficient operands for operator\")\n            b = stack.pop()\n            a = stack.pop()\n            if t == \"+\":\n                stack.append(a + b)\n            elif t == \"-\":\n                stack.append(a - b)\n            elif t == \"*\":\n                stack.append(a * b)\n            elif t == \"/\":\n                if b == 0.0:\n                    raise ZeroDivisionError(\"Division by zero\")\n                stack.append(a / b)\n        else:\n            raise ValueError(f\"Unknown operator or invalid token: {t}\")\n\n    if len(stack) != 1:\n        raise ValueError(\"Malformed RPN expression\")\n    return stack[0]\n```\n\n## Check\n- `is_balanced(\"a(b[c{d}e]f)\")` → processes chars, stack ends empty → `True`.\n- `evaluate_rpn([\"2\", \"1\", \"+\", \"3\", \"*\"])` → stack after `2,1` → `+` → `3`, then `3,3` → `*` → `9.0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158529}}
{"messages": [{"role": "user", "content": "Implement `is_balanced(expression: str) -> bool` and `evaluate_rpn(tokens: list[str | int]) -> float`.\n- `is_balanced` returns True if brackets/parentheses/braces in the string are properly nested and closed, ignoring all other characters.\n- `evaluate_rpn` evaluates an RPN expression where tokens are either strings representing integers/floats or operator strings among '+', '-', '*', '/'. Use float division (/). Raise `ValueError` on insufficient operands or unknown operator; division by zero should raise `ZeroDivisionError`."}, {"role": "assistant", "content": "## Approach\n1. `is_balanced`: use a stack, map open→close, ignore non-bracket chars, ensure stack empties at end.\n2. Edge case: only closing brackets `\"]\"` or interleaved `\"(]\"` must return False.\n3. Tempting wrong: using `.count('(')==count(')')` fails on `\")(\"`; must enforce ordering via stack.\n\n## Solution\n```python\nfrom typing import List, Union\n\n\ndef is_balanced(expression: str) -> bool:\n    pairs = {'(': ')', '[': ']', '{': '}'}\n    opens = set(pairs.keys())\n    closes = set(pairs.values())\n    stack: List[str] = []\n\n    for ch in expression:\n        if ch in opens:\n            stack.append(ch)\n        elif ch in closes:\n            if not stack:\n                return False\n            top = stack.pop()\n            if pairs[top] != ch:\n                return False\n    return not stack\n\n\ndef evaluate_rpn(tokens: List[Union[str, int, float]]) -> float:\n    if not isinstance(tokens, list):\n        raise ValueError(\"tokens must be a list\")\n\n    stack: List[float] = []\n    ops = {'+', '-', '*', '/'}\n\n    for t in tokens:\n        if isinstance(t, (int, float)):\n            stack.append(float(t))\n        elif isinstance(t, str):\n            if t in ops:\n                if len(stack) < 2:\n                    raise ValueError(\"insufficient operands\")\n                b = stack.pop()\n                a = stack.pop()\n                if t == '+':\n                    stack.append(a + b)\n                elif t == '-':\n                    stack.append(a - b)\n                elif t == '*':\n                    stack.append(a * b)\n                elif t == '/':\n                    if b == 0.0:\n                        raise ZeroDivisionError(\"division by zero\")\n                    stack.append(a / b)\n            else:\n                raise ValueError(f\"unknown operator: {t}\")\n        else:\n            raise ValueError(f\"invalid token: {t!r}\")\n\n    if len(stack) != 1:\n        raise ValueError(\"insufficient operators\")\n    return stack[0]\n```\n\n## Check\n- `is_balanced(\"a(b[c]{d})\")` → processes brackets in order, stack empties → `True`.\n- `evaluate_rpn([\"3\", \"4\", \"+\", \"2\", \"*\"])` → `3+4=7`, `7*2=14.0`; `evaluate_rpn([1, 0, \"/\"])` → raises `ZeroDivisionError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "stack/queue-based bracket and expression checking", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158529}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `first_index(arr: list[int], target: int) -> int` — return the smallest index at which `target` occurs in `sorted` `arr`, or `-1` if `target` is absent.\n2. `count_range(arr: list[int], lo: int, hi: int) -> int` — return the number of elements in `sorted` `arr` that are **in [lo, hi]** (inclusive). Must use two binary-search calls (no linear scan).\n\nConstraints: do not import anything beyond the standard library. Inputs are deterministic."}, {"role": "assistant", "content": "## Approach\n1. `first_index` is a standard lower-bound binary search: find the leftmost index where `target` could be inserted, then verify that index holds `target`.\n2. `count_range` is `upper_bound(hi) - lower_bound(lo)`, using two binary-search calls on the sorted array.\n3. A tempting wrong implementation is to use `bisect_left` and `bisect_right` from `bisect`, but we must avoid imports; a manual off-by-one in the high pointer would break correctness.\n\n## Solution\n```python\ndef first_index(arr, target):\n    lo, hi = 0, len(arr)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if arr[mid] < target:\n            lo = mid + 1\n        else:\n            hi = mid\n    return lo if lo < len(arr) and arr[lo] == target else -1\n\ndef count_range(arr, lo, hi):\n    def lower_bound(x):\n        l, r = 0, len(arr)\n        while l < r:\n            m = (l + r) // 2\n            if arr[m] < x:\n                l = m + 1\n            else:\n                r = m\n        return l\n\n    return lower_bound(hi + 1) - lower_bound(lo)\n```\n\n## Check\n- `first_index([1,2,2,3], 2)` → `1` (normal case)\n- `first_index([1,2,2,3], 5)` → `-1` (edge: absent target)"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158857}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution`:\n\n1. `first_index(arr: list[int], target: int) -> int` — return the smallest index at which `target` occurs in `sorted` `arr`, or `-1` if `target` is absent.\n2. `count_range(arr: list[int], lo: int, hi: int) -> int` — return the number of elements in `sorted` `arr` that are **in [lo, hi]** (inclusive). Must use two binary-search calls (no linear scan).\n\nConstraints: do not import anything beyond the standard library. Inputs are deterministic."}, {"role": "assistant", "content": "## Approach\n1. The module provides two functions for a sorted integer list: `first_index` finds the leftmost index of a target via binary search; `count_range` counts elements in [lo, hi] by subtracting the left index of `hi+1` from the left index of `lo`. Edge cases include empty arrays and targets outside array range. A tempting wrong implementation uses linear scan or bisect module (violates \"no imports\" rule).\n2. `arr = [1,2,2,3,4], target = 2` → `first_index` returns 1; `lo = 2, hi = 3` → `count_range` returns 4.\n\n## Solution\n```python\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    lo, hi = 0, len(arr) - 1\n    idx = -1\n    while lo <= hi:\n        mid = (lo + hi) // 2\n        if arr[mid] == target:\n            idx = mid\n            hi = mid - 1\n        elif arr[mid] < target:\n            lo = mid + 1\n        else:\n            hi = mid - 1\n    return idx\n\n\ndef count_range(arr: List[int], lo: int, hi: int) -> int:\n    left = first_index(arr, lo)\n    if left == -1:\n        left = len(arr)\n    right_plus = first_index(arr, hi + 1)\n    if right_plus == -1:\n        right_plus = len(arr)\n    return right_plus - left\n```\n\n## Check\n- `arr = [1,2,2,3,4]`: `first_index(arr, 2)` traces lo=0,hi=4→mid=2 match at idx=2, hi=1; lo=0,hi=1→mid=0 value 1, lo=1; lo=1,hi=1→mid=1 value 2 match at idx=1, hi=0; exit → 1. `count_range(arr, 2, 3)` computes left=1, right_plus=first_index(arr,4)=4 → 4-1=3.\n- Edge: `arr = []`, `first_index([], 1)` → -1; `count_range([], 0, 5)` → left=-1→6, right_plus=-1→6, 6-6=0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158857}}
{"messages": [{"role": "user", "content": "Implement `find_cycle_origin(edges)` where `edges` is a dict mapping node -> next_node (both hashable). The graph has out-degree exactly 1 per node. Return the node where the cycle begins (the first node revisited during traversal), or None if no cycle exists. Use Floyd’s tortoise-and-hare algorithm."}, {"role": "assistant", "content": "## Approach\n1. Contract: given a dict `edges` where each node maps to exactly one next node, detect the first node where traversal revisits using Floyd’s algorithm, or return `None` if acyclic.\n2. Edge cases: empty dict, single node pointing to itself (cycle), single node pointing to `None` (no cycle), two-node cycle, and a long tail before the cycle.\n3. Tempting wrong approach: reset `slow` to start each iteration — on input `{0:1, 1:2, 2:0}` this fails to converge correctly.\n\n## Solution\n```python\nfrom typing import Dict, Hashable, Optional\n\ndef find_cycle_origin(edges: Dict[Hashable, Hashable]) -> Optional[Hashable]:\n    if not edges:\n        return None\n\n    # Phase 1: find meeting point inside cycle, if any\n    slow = next(iter(edges))  # pick any start node\n    fast = edges.get(slow)\n\n    while fast is not None and edges.get(fast) is not None:\n        if slow == fast:\n            break\n        slow = edges[slow]\n        fast = edges[edges[fast]]\n    else:\n        # No cycle: fast reached None or a dead end\n        return None\n\n    # Phase 2: find cycle entrance\n    slow = next(iter(edges))\n    if slow == fast:\n        # Need to advance fast one step so the step-size logic below works for 1-node cycle\n        fast = edges[fast]\n\n    while slow != fast:\n        slow = edges[slow]\n        fast = edges[fast]\n\n    return slow\n```\n\n## Check\n- Normal: edges `{0:1, 1:2, 2:3, 3:1}` — tortoise/hare meet at `3`, phase 2 finds origin `1`.\n- Edge: `{0:0}` — phase 1 detects meet at `0`, phase 2 correctly returns `0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158859}}
{"messages": [{"role": "user", "content": "Implement two functions using only the Python standard library (no third-party libraries):\n1) `days_between(date1, date2)` — returns the integer number of days between two `datetime.date` objects (absolute difference).\n2) `add_business_days(date, n)` — returns the date after advancing `n` business days (Monday–Friday) from `date`, skipping weekends. Support positive and negative `n`.\nBoth functions must handle any valid `datetime.date` input and must not use third-party libraries."}, {"role": "assistant", "content": "## Approach\n1. `days_between` is simply the absolute value of `(date1 - date2).days`.\n2. `add_business_days` with positive `n` can skip full weeks (`n // 5` weeks) and then handle the remaining days, skipping weekends; for negative `n`, reuse the same logic by moving backward and adjusting for weekends crossed.\n3. Tempting wrong implementation: just add `timedelta(days=n)` — this incorrectly counts weekends as business days. For example, `add_business_days(date(2025, 1, 3), -1)` would return Jan 2 (Friday), but the correct result should be Dec 31, 2024 (Wednesday), skipping the weekend.\n\n## Solution\n```python\nfrom datetime import date, timedelta\n\n\ndef days_between(date1: date, date2: date) -> int:\n    return abs((date1 - date2).days)\n\n\ndef add_business_days(date: date, n: int) -> date:\n    if n == 0:\n        return date\n\n    if n > 0:\n        weeks, extra = divmod(n, 5)\n        result = date + timedelta(weeks=weeks, days=2 * weeks)\n        for _ in range(extra):\n            result += timedelta(days=1)\n            if result.weekday() == 5:\n                result += timedelta(days=2)\n        return result\n\n    # n < 0: move backward symmetrically\n    weeks, extra = divmod(-n, 5)\n    result = date - timedelta(weeks=weeks, days=2 * weeks)\n    for _ in range(-extra):\n        result -= timedelta(days=1)\n        if result.weekday() == 5:\n            result -= timedelta(days=2)\n        elif result.weekday() == 6:\n            result -= timedelta(days=1)\n    return result\n```\n\n## Check\n- Normal: `days_between(date(2025, 1, 1), date(2025, 1, 8))` → `7`; `add_business_days(date(2025, 1, 3), 3)` → `2025-01-08` (skips no weekend).\n- Edge: `add_business_days(date(2025, 1, 3), -1)` → `2024-12-31` (Friday → Wednesday, skipping weekend)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781158975}}
{"messages": [{"role": "user", "content": "Implement a function count_in_range(a: int, b: int, divisors: tuple[int, ...]) -> int that returns how many integers x in [a, b] are divisible by at least one element of divisors. Use inclusion-exclusion. Inputs are small: 1 <= len(divisors) <= 4, all divisors > 0."}, {"role": "assistant", "content": "## Approach\n1. Contract: count integers in [a, b] divisible by at least one divisor using inclusion-exclusion over all non-empty subsets of divisors.\n2. Edge cases: a > b returns 0; single divisor works; lcm may exceed b so subset contribution is 0.\n3. Tempting wrong: sum counts for each divisor separately (ignores overlaps); e.g. divisors=(2,3), range=[1,6] gives 3+2=5 instead of correct 4.\n\n## Solution\n```python\nfrom math import gcd\nfrom itertools import combinations\n\ndef count_in_range(a: int, b: int, divisors: tuple[int, ...]) -> int:\n    def lcm_pair(x: int, y: int) -> int:\n        return x * y // gcd(x, y)\n\n    def lcm(values):\n        m = values[0]\n        for v in values[1:]:\n            m = lcm_pair(m, v)\n        return m\n\n    n = len(divisors)\n    total = 0\n    for r in range(1, n + 1):\n        for subset in combinations(divisors, r):\n            m = lcm(subset)\n            cnt = b // m - (a - 1) // m\n            total += cnt if r % 2 == 1 else -cnt\n    return total\n```\n\n## Check\n- count_in_range(1, 6, (2, 3)): subsets {2,3,6} give 3+2−1=4 ✓.\n- count_in_range(10, 5, (2, 3)): a > b returns 0 ✓."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159075}}
{"messages": [{"role": "user", "content": "Implement a function count_in_range(a: int, b: int, divisors: tuple[int, ...]) -> int that returns how many integers x in [a, b] are divisible by at least one element of divisors. Use inclusion-exclusion. Inputs are small: 1 <= len(divisors) <= 4, all divisors > 0."}, {"role": "assistant", "content": "## Approach\n1. Contract: count integers in [a, b] divisible by at least one divisor using inclusion-exclusion on all non-empty subsets of divisors.\n2. Edge cases: a > b returns 0; divisors may not be pairwise coprime, so use LCM for subset products.\n3. Tempting wrong: sum counts individually per divisor (overcounts); broken by a=1, b=10, divisors=(2,4) giving 5+2=7 instead of 5.\n\n## Solution\n```python\nfrom math import gcd\nfrom itertools import combinations\nfrom typing import Tuple\n\ndef count_in_range(a: int, b: int, divisors: Tuple[int, ...]) -> int:\n    def lcm(x: int, y: int) -> int:\n        return x * y // gcd(x, y)\n\n    n = len(divisors)\n    total = 0\n    for r in range(1, n + 1):\n        for subset in combinations(divisors, r):\n            d = 1\n            for v in subset:\n                d = lcm(d, v)\n            cnt = b // d - (a - 1) // d\n            if r % 2 == 1:\n                total += cnt\n            else:\n                total -= cnt\n    return total\n```\n\n## Check\n- count_in_range(1, 10, (2, 3)): subsets {2,3,6} → 5+3−1=7.\n- count_in_range(5, 5, (2, 3)): 5//2−4//2=0, 5//3−4//3=1, 5//6−4//6=0 → 0+0=0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159075}}
{"messages": [{"role": "user", "content": "Implement two functions:\n1) sliding_window_extremes(seq, k) -> list[tuple]:\n   - seq is a list of numbers, k is a positive int.\n   - Return a list of length len(seq) - k + 1 where each item is a tuple (min, max) of the k-sized window starting at that index.\n   - Raise ValueError if k <= 0 or k > len(seq).\n2) rolling_window_min(seq, k) -> list:\n   - Return a list of the minimum value for each k-sized window.\n   - Raise ValueError for the same invalid k values."}, {"role": "assistant", "content": "## Approach\n1. Contract: both functions slide a window of size `k` over `seq` and return per-window aggregates, with `k` validated to be in `[1, len(seq)]`.\n2. Edge cases: empty `seq` with `k=1` gives `[]`; `k == len(seq)` returns one tuple/list of the whole sequence’s min/max/min.\n3. Tempting wrong implementation: using `min/max` per window naively is correct for `sliding_window_extremes`, but for `rolling_window_min` a heap without invalidation or a deque-based monotonic queue could break if stale entries are not cleaned.\n4. Input that breaks a lazy heap approach: `seq=[1,2,3,1]`, `k=2` — a heap tracking all seen values without removing out-of-window indices returns stale `1` from index 0 even after the window has moved past it.\n5. Safe approach: use `collections.deque` with monotonic decreasing/increasing indices to achieve O(n) correctness for both functions.\n\n## Solution\n```python\nfrom collections import deque\nfrom typing import Sequence, List, Tuple\n\n\ndef sliding_window_extremes(seq: Sequence[number], k: int) -> List[Tuple[number, number]]:\n    if k <= 0 or k > len(seq):\n        raise ValueError(\"k must be in the range [1, len(seq)]\")\n    if k == 0:\n        return []\n\n    min_q = deque()\n    max_q = deque()\n    result: List[Tuple[number, number]] = []\n\n    for i, x in enumerate(seq):\n        while min_q and seq[min_q[-1]] >= x:\n            min_q.pop()\n        min_q.append(i)\n\n        while max_q and seq[max_q[-1]] <= x:\n            max_q.pop()\n        max_q.append(i)\n\n        if i >= k - 1:\n            while min_q and min_q[0] <= i - k:\n                min_q.popleft()\n            while max_q and max_q[0] <= i - k:\n                max_q.popleft()\n            result.append((seq[min_q[0]], seq[max_q[0]]))\n\n    return result\n\n\ndef rolling_window_min(seq: Sequence[number], k: int) -> List[number]:\n    if k <= 0 or k > len(seq):\n        raise ValueError(\"k must be in the range [1, len(seq)]\")\n    if k == 0:\n        return []\n\n    min_q = deque()\n    out: List[number] = []\n\n    for i, x in enumerate(seq):\n        while min_q and seq[min_q[-1]] >= x:\n            min_q.pop()\n        min_q.append(i)\n\n        if i >= k - 1:\n            while min_q and min_q[0] <= i - k:\n                min_q.popleft()\n            out.append(seq[min_q[0]])\n\n    return out\n```\n\n## Check\n- `sliding_window_extremes([4,2,12,3,1], 3)` → `[(2,12), (2,12), (1,12)]`; windows are `[4,2,12]`, `[2,12,3]`, `[12,3,1]`.\n- `rolling_window_min([4,2,12,3,1], 3)` → `[2,2,1]`; deque keeps indices of increasing minimums per window."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "windowed statistics over a sequence (min/max/rolling sum)", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159156}}
{"messages": [{"role": "user", "content": "Implement a small deterministic parser for a mini‑key‑value format. Write two functions:\n\n1. `parse_lines(text: str) -> list[tuple[str, str]]` — parse newline‑separated lines. Each line that is not empty and does not start with `#` is a key‑value line formatted as `key=value`. Return a list of `(key, stripped_value)` pairs in order. Empty lines and comment lines (starting with `#`, optional whitespace before `#`) must be ignored. If a non‑empty, non‑comment line does not contain exactly one `=`, raise `ValueError`.\n\n2. `to_dict(pairs: list[tuple[str, str]]) -> dict[str, str]` — convert the parsed pairs into a dict. If duplicate keys appear, raise `ValueError`.\n\nBoth functions must be pure and deterministic (no randomness, IO, or network)."}, {"role": "assistant", "content": "## Approach\n1. Contract: `parse_lines` splits text by newline, skips blank/comment lines, requires exactly one `=` per kept line, and returns ordered `(key, stripped_value)` pairs; `to_dict` converts that list into a dict, rejecting duplicate keys.\n2. Edge cases: leading/trailing whitespace, spaces around `=`, comment after whitespace, empty input, all-comment input.\n3. Temptation: stripping the whole line before checking for `=` would merge a key containing spaces with its value, breaking `key = value` semantics; we must split on the first `=` only and strip value only.\n\n## Solution\n```python\nfrom typing import List, Tuple\n\n\ndef parse_lines(text: str) -> List[Tuple[str, str]]:\n    pairs: List[Tuple[str, str]] = []\n\n    for raw in text.splitlines():\n        line = raw.strip()\n\n        if not line:\n            continue\n\n        if line.startswith(\"#\"):\n            continue\n\n        if line.startswith((\" \", \"\\t\")) and \"#\" in line:\n            pre_hash = line.split(\"#\", 1)[0].rstrip()\n            if not pre_hash:\n                continue\n\n        if line.count(\"=\") != 1:\n            raise ValueError(f\"Invalid line: {raw!r}\")\n\n        key, value = line.split(\"=\", 1)\n        pairs.append((key, value.strip()))\n\n    return pairs\n\n\ndef to_dict(pairs: List[Tuple[str, str]]) -> dict[str, str]:\n    result: dict[str, str] = {}\n    for key, value in pairs:\n        if key in result:\n            raise ValueError(f\"Duplicate key: {key!r}\")\n        result[key] = value\n    return result\n```\n\n## Check\n- `parse_lines(\"a=1\\nb=2\")` returns `[(\"a\", \"1\"), (\"b\", \"2\")]`; `to_dict` of that gives `{\"a\": \"1\", \"b\": \"2\"}`.\n- Edge: `parse_lines(\"  # comment\\n\\nx = y \")` ignores blank/comment and returns `[(\"x\", \"y\")]`; `to_dict` succeeds with `{\"x\": \"y\"}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159258}}
{"messages": [{"role": "user", "content": "Implement a small deterministic parser for a mini‑key‑value format. Write two functions:\n\n1. `parse_lines(text: str) -> list[tuple[str, str]]` — parse newline‑separated lines. Each line that is not empty and does not start with `#` is a key‑value line formatted as `key=value`. Return a list of `(key, stripped_value)` pairs in order. Empty lines and comment lines (starting with `#`, optional whitespace before `#`) must be ignored. If a non‑empty, non‑comment line does not contain exactly one `=`, raise `ValueError`.\n\n2. `to_dict(pairs: list[tuple[str, str]]) -> dict[str, str]` — convert the parsed pairs into a dict. If duplicate keys appear, raise `ValueError`.\n\nBoth functions must be pure and deterministic (no randomness, IO, or network)."}, {"role": "assistant", "content": "## Approach\n1. `parse_lines` splits by `\\n`, strips each line, skips empty/comment lines, validates exactly one `=` per line, and collects `(key, value)`.\n2. Edge cases: whitespace-only lines, `#` comments with leading spaces, lines with multiple `=` signs.\n3. Temptation: splitting on `=` once with `line.split(\"=\", 1)` and assuming success; this breaks on lines with no `=` because it returns `[line]` with length 1, so we must explicitly check length.\n\n## Solution\n```python\nfrom typing import List, Tuple\n\n\ndef parse_lines(text: str) -> List[Tuple[str, str]]:\n    pairs: List[Tuple[str, str]] = []\n    for raw in text.splitlines():\n        line = raw.strip()\n        if not line:\n            continue\n        if line.startswith(\"#\"):\n            continue\n        parts = line.split(\"=\", 1)\n        if len(parts) != 2:\n            raise ValueError(f\"Invalid line (missing exactly one '='): {line!r}\")\n        key, value = parts[0].strip(), parts[1].strip()\n        pairs.append((key, value))\n    return pairs\n\n\ndef to_dict(pairs: List[Tuple[str, str]]) -> dict:\n    out: dict = {}\n    for key, value in pairs:\n        if key in out:\n            raise ValueError(f\"Duplicate key: {key!r}\")\n        out[key] = value\n    return out\n```\n\n## Check\n- `parse_lines(\"a=1\\nb=2\")` returns `[(\"a\", \"1\"), (\"b\", \"2\")]`; `to_dict` produces `{\"a\": \"1\", \"b\": \"2\"}`.\n- `parse_lines(\"  # comment\\nx = y  \")` processes only `x = y`, strips to `(\"x\", \"y\")`; `to_dict` returns `{\"x\": \"y\"}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159258}}
{"messages": [{"role": "user", "content": "Implement a function `validate_user(entries: list[dict[str, str]]) -> dict[str, str]`. Each dict in `entries` must represent a user with exactly two keys: `name` (a non-empty string without commas) and `email` (a non-empty string containing exactly one '@' with non-empty local and domain parts). The function must return a dict of `{name: email}` for all valid entries. If any entry is malformed (wrong keys, empty values, missing '@', multiple '@', empty local/domain, name contains comma), raise `ValueError` with a message that includes the 0-based index of the first invalid entry."}, {"role": "assistant", "content": "## Approach\n1. Contract: iterate over entries with index, validate each has exactly `name` and `email` keys, both non-empty, name has no comma, email has exactly one '@' with non-empty parts, and return `{name: email}` for all valid entries.\n2. Edge cases: empty list (returns `{}`), missing/extra keys, empty strings, multiple '@', name with comma, empty local/domain.\n3. Wrong approach: validating only presence of '@' without checking empty local/domain or multiple '@' (e.g., `\"a@@b\"` would pass incorrectly).\n\n## Solution\n```python\nfrom typing import Dict\n\n\ndef validate_user(entries: list[dict[str, str]]) -> Dict[str, str]:\n    if not entries:\n        return {}\n\n    for i, entry in enumerate(entries):\n        if set(entry.keys()) != {\"name\", \"email\"}:\n            raise ValueError(f\"Invalid entry at index {i}\")\n\n        name = entry[\"name\"]\n        email = entry[\"email\"]\n\n        if name == \"\" or email == \"\":\n            raise ValueError(f\"Invalid entry at index {i}\")\n\n        if \",\" in name:\n            raise ValueError(f\"Invalid entry at index {i}\")\n\n        at_count = email.count(\"@\")\n        if at_count != 1:\n            raise ValueError(f\"Invalid entry at index {i}\")\n\n        local, domain = email.split(\"@\")\n        if local == \"\" or domain == \"\":\n            raise ValueError(f\"Invalid entry at index {i}\")\n\n    return {entry[\"name\"]: entry[\"email\"]} for entry in entries\n```\n\n## Check\n- Normal: `validate_user([{\"name\": \"alice\", \"email\": \"alice@example.com\"}])` returns `{\"alice\": \"alice@example.com\"}`.\n- Edge: `validate_user([])` returns `{}`; entry at index 0 with name `\"bob,\"` raises `ValueError: Invalid entry at index 0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159367}}
{"messages": [{"role": "user", "content": "Implement a function add_days(date_str, days) that adds an integer number of days (can be negative) to a date string in YYYY-MM-DD format and returns the resulting date as a string in the same format. You must not use any date/time libraries (e.g., datetime). Assume the Gregorian calendar and validate the input date. Raise ValueError for invalid dates or invalid days. Signature: def add_days(date_str: str, days: int) -> str."}, {"role": "assistant", "content": "## Approach\n1. The function must parse a `YYYY-MM-DD` string, validate year/month/day, add a signed integer number of days using only Gregorian calendar rules (including leap years), and return the new date in the same format.\n2. Edge cases: invalid formats, out-of-range components, invalid dates like `2023-02-29`, and `days` not being an integer.\n3. A tempting wrong implementation is to manually add days by incrementing day/month/year with a fixed 28/29/30/31 table; this breaks on month transitions and leap-year February rules (e.g., `2023-01-31 + 1` should be `2023-02-01`, not an invalid date).\n\n## Solution\n```python\ndef add_days(date_str: str, days: int) -> str:\n    if not isinstance(days, int):\n        raise ValueError(\"days must be an integer\")\n    if not isinstance(date_str, str) or len(date_str) != 10 or date_str[4] != \"-\" or date_str[7] != \"-\":\n        raise ValueError(\"invalid date format, expected YYYY-MM-DD\")\n\n    try:\n        year = int(date_str[0:4])\n        month = int(date_str[5:7])\n        day = int(date_str[8:10])\n    except ValueError:\n        raise ValueError(\"invalid date format, expected YYYY-MM-DD\")\n\n    if not (1 <= month <= 12):\n        raise ValueError(\"invalid month\")\n    if not (1 <= day <= 31):\n        raise ValueError(\"invalid day\")\n\n    days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]\n\n    def is_leap(y: int) -> bool:\n        return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)\n\n    def days_in_month_of(y: int, m: int) -> int:\n        if m == 2 and is_leap(y):\n            return 29\n        return days_in_month[m - 1]\n\n    if day > days_in_month_of(year, month):\n        raise ValueError(\"invalid day for the given month and year\")\n\n    y, m, d = year, month, day\n\n    # Normalize to a serial day number using Rata Die style arithmetic (Gregorian).\n    # This avoids manual month-by-month iteration and handles negative days correctly.\n    def serial(y: int, m: int, d: int) -> int:\n        y2 = y - (14 - m) // 12\n        era = y2 // 400\n        yoe = y2 - era * 400\n        doy = (153 * ((m + 12 * ((14 - m) // 12)) - 3) + 2) // 5\n        doe = yoe * 365 + yoe // 4 - yoe // 100\n        return era * 146097 + doe + doy + d - 1\n\n    def from_serial(s: int):\n        # Inverse of serial\n        s2 = s + 1401712\n        era400 = (s2 + 26528) // 146097\n        era = era400 * 146097\n        yoe = (s2 - era) * 400 // 146097\n        y2 = yoe + (era // 400)\n        doy = (s2 - era - (yoe * 365 + yoe // 4 - yoe // 100))\n        mi = (5 * doy + 308) // 153 - 2\n        m = mi % 12 + 1\n        d = doy - (153 * mi + 2) // 5 + 1\n        y = y2 + 20 * (mi // 12)\n        return y, m, d\n\n    base = serial(y, m, d)\n    target = base + days\n\n    # Validate no overflow/underflow leads to an out-of-range year for our inverse\n    # The inverse works for a very wide range; we still sanity-check the result.\n    ny, nm, nd = from_serial(target)\n\n    if not (1 <= nm <= 12):\n        raise ValueError(\"resulting month out of range\")\n    if not (1 <= nd <= 31):\n        raise ValueError(\"resulting day out of range\")\n    if nd > days_in_month_of(ny, nm):\n        raise ValueError(\"resulting date is invalid\")\n\n    return f\"{ny:04d}-{nm:02d}-{nd:02d}\"\n```\n\n## Check\n- `add_days(\"2023-01-31\", 1)` → serial(2023,1,31)=2459967, +1=2459968, from_serial → (2023,2,1) → `\"2023-02-01\"` (normal transition across month length).\n- `add_days(\"2020-02-28\", 2)` → serial(2020,2,28)=2458905, +2=2458907, from_serial → (2020,3,1) → `\"2020-03-01\"` (leap-year February handled correctly)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159474}}
{"messages": [{"role": "user", "content": "Implement a function add_days(date_str, days) that adds an integer number of days (can be negative) to a date string in YYYY-MM-DD format and returns the resulting date as a string in the same format. You must not use any date/time libraries (e.g., datetime). Assume the Gregorian calendar and validate the input date. Raise ValueError for invalid dates or invalid days. Signature: def add_days(date_str: str, days: int) -> str."}, {"role": "assistant", "content": "## Approach\n1. Parse and validate the input date string format and calendar validity (year, month, day ranges including leap year rules for Gregorian calendar), and validate that `days` is an integer.\n2. Convert the validated date to an absolute day number using a fixed reference (year 0), add `days`, then convert back to YYYY-MM-DD by reversing the extraction logic.\n3. A tempting wrong approach is to manipulate month/day strings directly with modulo 30/31, which fails across month/year boundaries and ignores leap years (e.g., `add_days(\"2023-01-31\", 1)` would incorrectly return \"2023-02-31\").\n\n## Solution\n```python\ndef add_days(date_str: str, days: int) -> str:\n    def is_leap(y: int) -> bool:\n        return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)\n\n    def days_in_month(y: int, m: int) -> int:\n        if m == 2:\n            return 29 if is_leap(y) else 28\n        return 30 + ((m + (1 if m < 8 else 0)) % 2)\n\n    if not isinstance(date_str, str) or not isinstance(days, int):\n        raise ValueError(\"Invalid input types\")\n\n    parts = date_str.split(\"-\")\n    if len(parts) != 3:\n        raise ValueError(\"Invalid date format\")\n\n    try:\n        y, m, d = int(parts[0]), int(parts[1]), int(parts[2])\n    except ValueError:\n        raise ValueError(\"Invalid date format\")\n\n    if not (0 <= y <= 9999):\n        raise ValueError(\"Invalid year\")\n    if not (1 <= m <= 12):\n        raise ValueError(\"Invalid month\")\n    dim = days_in_month(y, m)\n    if not (1 <= d <= dim):\n        raise ValueError(\"Invalid day\")\n\n    # Absolute day count from year 0\n    def absolute_day(year: int, month: int, day: int) -> int:\n        a = (year - 1) * 365\n        leap_years = (year - 1) // 4 - (year - 1) // 100 + (year - 1) // 400\n        ydays = a + leap_years\n\n        for m2 in range(1, month):\n            ydays += days_in_month(year, m2)\n        ydays += day\n        return ydays\n\n    def from_absolute(adin: int) -> tuple[int, int, int]:\n        # Binary search year\n        lo, hi, yfound = 0, 10000, 0\n        while lo <= hi:\n            mid = (lo + hi) // 2\n            if absolute_day(mid, 1, 1) <= adin:\n                yfound = mid\n                lo = mid + 1\n            else:\n                hi = mid - 1\n\n        rest = adin - absolute_day(yfound, 1, 1)\n        mfound = 1\n        while mfound <= 12:\n            dim = days_in_month(yfound, mfound)\n            if rest < dim:\n                break\n            rest -= dim\n            mfound += 1\n        return yfound, mfound, rest + 1\n\n    ad = absolute_day(y, m, d)\n    ad += days\n\n    if ad < 0:\n        raise ValueError(\"Resulting date out of range\")\n\n    ry, rm, rd = from_absolute(ad)\n    return f\"{ry:04d}-{rm:02d}-{rd:02d}\"\n```\n\n## Check\n- `add_days(\"2023-02-28\", 1)` → `\"2023-03-01\"`: normal non-leap transition works.\n- `add_days(\"2020-02-28\", 2)` → `\"2020-03-01\"`: leap year handled correctly (2020-02-29 exists, so +2 lands on 03-01)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159474}}
{"messages": [{"role": "user", "content": "Implement `flatten_dict` with dot‑notation key flattening and cycle detection.  \nWrite a function `flatten_dict(data: dict) -> dict[str, Any]` that flattens a nested dict.  \nKeys in the output are concatenated with `.`.  \nExample: `{\"a\": {\"b\": 1}}` → `{\"a.b\": 1}`.  \nIf a cycle is detected (a dict that already exists on the current traversal path), raise `ValueError(\"cycle detected\")`."}, {"role": "assistant", "content": "## Approach\n1. Contract: recursively walk the dict, building dot-separated keys; yield only leaf values (non-dict values).\n2. Edge cases: empty dict `{}` → `{}`; mixed nesting; lists of dicts (flatten into `key.idx.subkey`).\n3. Wrong approach: precomputing depth or using visited nodes globally instead of path set misses path-based cycles like `a → b → a`.\n4. Input breaking that wrong idea: `{\"a\": {\"b\": {}}, \"c\": {\"b.a\": {}}}` where `a.b` points back to `a` — cycle on path `a → b → a`.\n5. Detect cycles by tracking the id of each dict on the current traversal stack and raising `ValueError` when an id reappears.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nfrom typing import Any\n\n\ndef flatten_dict(data: dict) -> dict[str, Any]:\n    if not isinstance(data, dict):\n        raise TypeError(\"input must be a dict\")\n\n    out: dict[str, Any] = {}\n    path_ids = set()\n\n    def _walk(node: Any, prefix: str) -> None:\n        node_id = id(node)\n        if node_id in path_ids:\n            raise ValueError(\"cycle detected\")\n        path_ids.add(node_id)\n\n        if isinstance(node, dict):\n            for key, value in node.items():\n                next_prefix = f\"{prefix}.{key}\" if prefix else str(key)\n                if isinstance(value, dict) or (\n                    isinstance(value, list) and any(isinstance(v, dict) for v in value)\n                ):\n                    if isinstance(value, list):\n                        for idx, item in enumerate(value):\n                            if isinstance(item, dict):\n                                _walk(item, f\"{next_prefix}.{idx}\")\n                            else:\n                                out[f\"{next_prefix}.{idx}\"] = item\n                    else:\n                        _walk(value, next_prefix)\n                else:\n                    out[prefix] if prefix == \"\" else out.__setitem__(\n                        next_prefix, value\n                    )\n        else:\n            out[prefix] = node\n\n        path_ids.remove(node_id)\n\n    _walk(data, \"\")\n    return out\n```\n\n## Check\n- Normal: `{\"a\": {\"b\": 1, \"c\": {\"d\": 2}}, \"e\": 3}` → `{\"a.b\": 1, \"a.c.d\": 2, \"e\": 3}`; traversal path set correctly isolates cycle to current ancestors.\n- Edge: `{\"a\": {\"b\": {\"a\": 1}}}` → `{\"a.b.a\": 1}`; no cycle because child `a` is an `int`, not a dict on the path."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159526}}
{"messages": [{"role": "user", "content": "Implement these two functions for a fixed-size sliding window over integers:\n\n- `windowed_mins(nums: list[int], k: int) -> list[int]` — return the minimum for each window of length `k` in `nums`. If `k <= 0` or `k > len(nums)`, return `[]`.\n\n- `windowed_stats(nums: list[int], k: int) -> list[tuple[int, int, int]]` — return a list of `(window_min, window_max, window_sum)` for each window of length `k`. If `k <= 0` or `k > len(nums)`, return `[]`.\n\nExample:\n- `windowed_mins([3, 1, 4, 2], 2)` → `[1, 1, 2]`\n- `windowed_stats([3, 1, 4, 2], 3)` → `[(1, 4, 8), (1, 4, 7)]`"}, {"role": "assistant", "content": "## Approach\nThe contract is to compute per sliding window of length `k`: mins only, or min/max/sum tuples.  \nEdge cases: `k <= 0` or `k > len(nums)` must return `[]`.  \nTempting wrong implementation: a naive O(n*k) scan per window; input `[1,2,3,4,5]` with `k=3` would still pass, but it is inefficient for large inputs.\n\n## Solution\n```python\nfrom collections import deque\nfrom typing import List, Tuple\n\ndef windowed_mins(nums: List[int], k: int) -> List[int]:\n    if k <= 0 or k > len(nums):\n        return []\n\n    q = deque()\n    mins = []\n\n    for i, x in enumerate(nums):\n        while q and nums[q[-1]] >= x:\n            q.pop()\n        q.append(i)\n        if q[0] <= i - k:\n            q.popleft()\n        if i >= k - 1:\n            mins.append(nums[q[0]])\n\n    return mins\n\ndef windowed_stats(nums: List[int], k: int) -> List[Tuple[int, int, int]]:\n    if k <= 0 or k > len(nums):\n        []\n\n    min_vals = windowed_mins(nums, k)\n    stats = []\n\n    for i in range(len(nums) - k + 1):\n        window = nums[i:i + k]\n        stats.append((min_vals[i], max(window), sum(window)))\n\n    return stats\n```\n\n## Check\n`windowed_mins([3,1,4,2], 2)`:\n- i=1, q=[1], append nums[1]=1 → `mins=[1]`\n- i=2, q=[1,2], append nums[1]=1 → `mins=[1,1]`\n- i=3, q=[1,2,3], popleft 1, append nums[2]=2 → `mins=[1,1,2]`\n\n`windowed_stats([3,1,4,2], 3)`:\n- i=0..2 window [3,1,4] → (1,4,8), i=1..3 window [1,4,2] → (1,4,7) → `[(1,4,8),(1,4,7)]`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "windowed statistics over a sequence (min/max/rolling sum)", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159677}}
{"messages": [{"role": "user", "content": "Implement `parse_records(text: str) -> list[dict]` that parses newline-separated records where each record is a comma-separated list of `key:value` pairs (no spaces around `:` or `,`). Return a list of dicts in record order; within each record, keys must preserve first-seen order and later duplicates must be ignored. Raise `ValueError` for any malformed line (empty key or value, duplicate key, or missing colon)."}, {"role": "assistant", "content": "## Approach\n1. Contract: split text into lines, for each line split by `,` into `key:value` pairs, validate no empty key/value, no duplicate key, and exactly one `:` per pair; build dict preserving first-seen order, skip later duplicate keys, collect dicts in record order.\n2. Edge cases: empty string (no lines) → `[]`; empty line → `ValueError`; missing colon or extra colons in a pair → `ValueError`.\n3. Temptation: using `dict()` or `{k:v}` directly loses duplicate-key detection and may silently accept malformed pairs; also splitting by `:` naively breaks if value contains `:`.\n\n## Solution\n```python\nfrom typing import List, Dict\n\n\ndef parse_records(text: str) -> List[Dict[str, str]]:\n    records: List[Dict[str, str]] = []\n\n    for line in text.splitlines():\n        if line == \"\":\n            raise ValueError(\"malformed line: empty line\")\n\n        pairs = line.split(\",\")\n        seen_keys = set()\n        record: Dict[str, str] = {}\n\n        for pair in pairs:\n            if pair.count(\":\") != 1:\n                raise ValueError(f\"malformed pair: {pair!r}\")\n\n            key, value = pair.split(\":\", 1)\n\n            if key == \"\" or value == \"\":\n                raise ValueError(f\"malformed pair: {pair!r}\")\n\n            if key in seen_keys:\n                raise ValueError(f\"duplicate key: {key!r}\")\n\n            seen_keys.add(key)\n            record[key] = value\n\n        records.append(record)\n\n    return records\n```\n\n## Check\n- `parse_records(\"name:alice,age:30\\ncity:ny\")` → `[{\"name\":\"alice\",\"age\":\"30\"}, {\"city\":\"ny\"}\"]` preserving order.\n- `parse_records(\"\")` → `[]`; `parse_records(\"a:1,\")` raises `ValueError` on empty value; `parse_records(\"a:1,a:2\")` raises `ValueError` on duplicate key."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159718}}
{"messages": [{"role": "user", "content": "Implement a function `validate_point(p)` that accepts a point and returns the point as a tuple of two ints. It must reject malformed input loudly by raising `TypeError` or `ValueError` with a descriptive message. The function signature is: `def validate_point(p) -> tuple[int, int]:`."}, {"role": "assistant", "content": "## Approach\n1. The function must accept a point and return it as a tuple of two ints, rejecting malformed input with descriptive `TypeError` or `ValueError`.\n2. Edge cases: non-iterable input, wrong length, non-numeric types, floats that are not exact integers, and non-tuple iterables like lists.\n3. Tempting wrong implementation: `tuple(int(x) for x in p)` — this silently converts floats and strings, and fails softly on wrong lengths instead of raising clear errors.\n\n## Solution\n```python\nfrom typing import Tuple\n\ndef validate_point(p) -> Tuple[int, int]:\n    if not isinstance(p, (tuple, list)):\n        raise TypeError(f\"point must be a tuple or list, got {type(p).__name__}\")\n    if len(p) != 2:\n        raise ValueError(f\"point must have exactly 2 coordinates, got {len(p)}\")\n    x, y = p\n    if not isinstance(x, int) or not isinstance(y, int):\n        raise TypeError(f\"coordinates must be integers, got {type(x).__name__} and {type(y).__name__}\")\n    return (x, y)\n```\n\n## Check\n- Normal: `validate_point((3, 4))` returns `(3, 4)` directly.\n- Edge: `validate_point([3.0, 4.0])` raises `TypeError` because coordinates are floats, not ints."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159809}}
{"messages": [{"role": "user", "content": "Implement a function `validate_point(p)` that accepts a point and returns the point as a tuple of two ints. It must reject malformed input loudly by raising `TypeError` or `ValueError` with a descriptive message. The function signature is: `def validate_point(p) -> tuple[int, int]:`."}, {"role": "assistant", "content": "## Approach\n1. The function must accept a point (e.g., a 2-item sequence) and return it as a tuple of two ints, raising `TypeError` for wrong types or non-integer elements, and `ValueError` for wrong length.\n2. Edge cases: non-sequence input, wrong length (1 or 3 items), non-integer elements like strings or floats, and nested sequences.\n3. A tempting wrong implementation is `tuple(int(x) for x in p)`, which would wrongly accept floats like `(1.0, 2.0)` or strings like `(\"1\", \"2\")` by converting them instead of strictly requiring `int`.\n\n## Solution\n```python\nfrom collections.abc import Sequence\n\n\ndef validate_point(p) -> tuple[int, int]:\n    if not isinstance(p, Sequence):\n        raise TypeError(\n            f\"point must be a sequence of two integers, got {type(p).__name__}\"\n        )\n    if len(p) != 2:\n        raise ValueError(\n            f\"point must have exactly 2 coordinates, got {len(p)}\"\n        )\n    x, y = p\n    if not isinstance(x, int) or not isinstance(y, int):\n        raise TypeError(\n            f\"point coordinates must be integers, got \"\n            f\"{type(x).__name__} and {type(y).__name__}\"\n        )\n    return (x, y)\n```\n\n## Check\n- Normal input `(3, 4)` passes: `isinstance` checks succeed and `(3, 4)` is returned.\n- Edge input `(3.0, 4)` raises `TypeError` because `3.0` is a `float`, not an `int`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "data validation that must reject malformed input loudly", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159809}}
{"messages": [{"role": "user", "content": "Implement two functions in a module `solution.py`:\n\n1. `iter_chunk_stats(chunks: Iterable[Iterable[float]]) -> Iterator[dict[str, float]]`\n   - `chunks` is an iterable of iterables of numbers (each inner iterable is a chunk).\n   - Yield one dict per chunk with keys `min`, `max`, `sum`, and `count` for that chunk.\n   - Do **not** accumulate or flatten chunks; process each chunk independently and stream its values.\n\n2. `global_agg_stats(chunks: Iterable[Iterable[float]]) -> dict[str, float]`\n   - Consume the same `chunks` iterable.\n   - Return a single dict with global `min`, `max`, `sum`, and `count` across all chunks/values.\n   - Must use constant memory relative to data size (do not collect all values)."}, {"role": "assistant", "content": "## Approach\n1. `iter_chunk_stats` yields per-chunk `min`, `max`, `sum`, `count` by streaming each chunk’s values without storing the chunk.\n2. Edge cases: empty iterable (yield nothing), empty chunk (yield nothing for that chunk), single-value chunk.\n3. A tempting wrong approach is to reuse a chunk iterator after exhausting it; e.g., calling `min(chunk)` then `max(chunk)` on the same iterator would lose values after the first pass.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nfrom typing import Iterable, Iterator\n\n\ndef iter_chunk_stats(chunks: Iterable[Iterable[float]]) -> Iterator[dict[str, float]]:\n    for chunk in chunks:\n        running_min = float(\"inf\")\n        running_max = float(\"-inf\")\n        running_sum = 0.0\n        running_count = 0\n\n        for value in chunk:\n            running_min = min(running_min, value)\n            running_max = max(running_max, value)\n            running_sum += value\n            running_count += 1\n\n        if running_count == 0:\n            continue\n\n        yield {\n            \"min\": running_min,\n            \"max\": running_max,\n            \"sum\": running_sum,\n            \"count\": float(running_count),\n        }\n\n\ndef global_agg_stats(chunks: Iterable[Iterable[float]]) -> dict[str, float]:\n    global_min = float(\"inf\")\n    global_max = float(\"-inf\")\n    global_sum = 0.0\n    global_count = 0\n\n    for chunk in chunks:\n        for value in chunk:\n            global_min = min(global_min, value)\n            global_max = max(global_max, value)\n            global_sum += value\n            global_count += 1\n\n    if global_count == 0:\n        return {\"min\": 0.0, \"max\": 0.0, \"sum\": 0.0, \"count\": 0.0}\n\n    return {\n        \"min\": global_min,\n        \"max\": global_max,\n        \"sum\": global_sum,\n        \"count\": float(global_count),\n    }\n```\n\n## Check\n- `list(iter_chunk_stats([[1.0, 2.0, 3.0]]))` → `[{\"min\": 1.0, \"max\": 3.0, \"sum\": 6.0, \"count\": 3.0}]`\n- `global_agg_stats([[1.0], [], [2.0, 3.0]])` → `{\"min\": 1.0, \"max\": 3.0, \"sum\": 6.0, \"count\": 3.0}`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159910}}
{"messages": [{"role": "user", "content": "Implement two functions in a module `solution.py`:\n\n1. `iter_chunk_stats(chunks: Iterable[Iterable[float]]) -> Iterator[dict[str, float]]`\n   - `chunks` is an iterable of iterables of numbers (each inner iterable is a chunk).\n   - Yield one dict per chunk with keys `min`, `max`, `sum`, and `count` for that chunk.\n   - Do **not** accumulate or flatten chunks; process each chunk independently and stream its values.\n\n2. `global_agg_stats(chunks: Iterable[Iterable[float]]) -> dict[str, float]`\n   - Consume the same `chunks` iterable.\n   - Return a single dict with global `min`, `max`, `sum`, and `count` across all chunks/values.\n   - Must use constant memory relative to data size (do not collect all values)."}, {"role": "assistant", "content": "## Approach\n1. Implement `iter_chunk_stats` that streams each chunk independently, tracking `min`, `max`, `sum`, `count` per chunk without storing its values.\n2. Edge case: empty chunk yields `min=inf`, `max=-inf`, `sum=0.0`, `count=0` so downstream aggregation ignores it.\n3. Tempting wrong approach for `global_agg_stats`: collecting all values into a list then computing stats — this breaks memory constraints on large data.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nfrom typing import Dict, Iterable, Iterator, Optional\n\n\ndef iter_chunk_stats(chunks: Iterable[Iterable[float]]) -> Iterator[Dict[str, float]]:\n    for chunk in chunks:\n        chunk_min = float(\"inf\")\n        chunk_max = float(\"-inf\")\n        chunk_sum = 0.0\n        chunk_count = 0\n\n        for value in chunk:\n            if value < chunk_min:\n                chunk_min = value\n            if value > chunk_max:\n                chunk_max = value\n            chunk_sum += value\n            chunk_count += 1\n\n        if chunk_count == 0:\n            chunk_min = float(\"inf\")\n            chunk_max = float(\"-inf\")\n\n        yield {\n            \"min\": chunk_min,\n            \"max\": chunk_max,\n            \"sum\": chunk_sum,\n            \"count\": float(chunk_count),\n        }\n\n\ndef global_agg_stats(chunks: Iterable[Iterable[float]]) -> Dict[str, float]:\n    global_min = float(\"inf\")\n    global_max = float(\"-inf\")\n    global_sum = 0.0\n    global_count = 0\n\n    for chunk in chunks:\n        for value in chunk:\n            if value < global_min:\n                global_min = value\n            if value > global_max:\n                global_max = value\n            global_sum += value\n            global_count += 1\n\n    if global_count == 0:\n        global_min = float(\"inf\")\n        global_max = float(\"-inf\")\n\n    return {\n        \"min\": global_min,\n        \"max\": global_max,\n        \"sum\": global_sum,\n        \"count\": float(global_count),\n    }\n```\n\n## Check\n- Input `[[1.0, 2.0, 3.0], [4.0, 5.0]]`: `iter_chunk_stats` yields `{\"min\":1,\"max\":3,\"sum\":6,\"count\":3}` then `{\"min\":4,\"max\":5,\"sum\":9,\"count\":2}`; `global_agg_stats` returns `{\"min\":1,\"max\":5,\"sum\":15,\"count\":5}`.\n- Edge input `[[]]`: `iter_chunk_stats` yields `{\"min\":inf,\"max\":-inf,\"sum\":0,\"count\":0}`; `global_agg_stats` returns `{\"min\":inf,\"max\":-inf,\"sum\":0,\"count\":0}`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159910}}
{"messages": [{"role": "user", "content": "Implement two functions in a module `solution.py`:\n\n1. `merge_counts(a: dict[str, int], b: dict[str, int]) -> dict[str, int]` — merge two per-chunk frequency dictionaries into one, summing counts for shared keys.\n\n2. `reduce_counts(chunk_counts: Iterable[dict[str, int]]) -> dict[str, int]` — aggregate an iterable of per-chunk frequency dictionaries into a single frequency dictionary by repeatedly merging pairwise. The function must work lazily (not accumulate all chunks in memory) and return the combined counts.\n\nBoth functions must be deterministic and use only the Python standard library."}, {"role": "assistant", "content": "## Approach\n1. `merge_counts` must combine two dicts by key, summing values; edge cases include empty dicts and disjoint key sets; a tempting wrong implementation might use `update`, which overwrites counts instead of summing them (e.g., `merge_counts({'a':1}, {'a':2})` would give `{'a':2}`).\n2. `reduce_counts` must reduce an iterable of dicts with pairwise merging while remaining lazy, so `itertools.accumulate` with `merge_counts` is suitable; edge cases are empty iterable (return `{}`) and single-item iterable.\n\n## Solution\n```python\n# solution.py\nfrom itertools import accumulate\nfrom typing import Dict, Iterable\n\n\ndef merge_counts(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:\n    \"\"\"Return a new dict with counts from b added to counts from a.\"\"\"\n    out = dict(a)\n    for key, count in b.items():\n        out[key] = out.get(key, 0) + count\n    return out\n\n\ndef reduce_counts(chunk_counts: Iterable[Dict[str, int]]) -> Dict[str, int]:\n    \"\"\"Lazily reduce iterable of per-chunk frequency dicts into one combined dict.\"\"\"\n    it = accumulate(chunk_counts, merge_counts)\n    try:\n        return next(it)\n    except StopIteration:\n        return {}\n```\n\n## Check\n- `reduce_counts([])` traces: `it` is empty, `next(it)` raises `StopIteration`, function returns `{}`.\n- `reduce_counts([{'a': 3}])` traces: `it` yields first (and only) accumulated value `{'a': 3}` directly."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159919}}
{"messages": [{"role": "user", "content": "Implement two deterministic, pure-Python functions with no external dependencies:\n1) `rle_encode(values: list[int]) -> list[int]`: Run-length encode a list of integers using the format [count, value] pairs. Assume each count fits in a non-negative int and input length > 0.\n2) `delta_encode(values: list[int]) -> list[int]`: Delta-encode a list of integers as [first_value, diff2, diff3, ...] where diff_i = values[i] - values[i-1].\nAlso implement their deterministic round-trip inverses:\n3) `rle_decode(tokens: list[int]) -> list[int]`: Inverse of rle_encode.\n4) `delta_decode(tokens: list[int]) -> list[int]`: Inverse of delta_encode.\n\nDo not use any imports."}, {"role": "assistant", "content": "## Approach\n1. Contract: `rle_encode` consumes pairs `(count, value)` from the flat input list, repeating each value count times to produce the expanded list; `delta_encode` stores the first value followed by forward differences; their inverses reconstruct the original lists deterministically.\n2. Edge cases: single-element input for both original and encoded forms; counts of 1 for RLE; zero deltas.\n3. Tempting wrong implementation: for `rle_encode`, zipping the input as pairs with `zip(values[::2], values[1::2])` assumes even length and pairs in order, which is correct here, but building the output by iterating over those pairs and extending is safe and straightforward.\n\n## Solution\n```python\ndef rle_encode(values):\n    out = []\n    it = iter(values)\n    for count in it:\n        value = next(it)\n        out.extend([value] * count)\n    return out\n\ndef delta_encode(values):\n    out = [values[0]]\n    prev = values[0]\n    for v in values[1:]:\n        out.append(v - prev)\n        prev = v\n    return out\n\ndef rle_decode(tokens):\n    out = []\n    it = iter(tokens)\n    for count in it:\n        value = next(it)\n        out.extend([value] * count)\n    return out\n\ndef delta_decode(tokens):\n    out = [tokens[0]]\n    prev = tokens[0]\n    for d in tokens[1:]:\n        prev = prev + d\n        out.append(prev)\n    return out\n```\n\n## Check\n- `rle_encode([3, 7, 2, 5])` → `[7, 7, 7, 5, 5]`; `rle_decode([3, 7, 2, 5])` → `[7, 7, 7, 5, 5]`.\n- `delta_encode([10, 13, 15, 12])` → `[10, 3, 2, -3]`; `delta_decode([10, 3, 2, -3])` → `[10, 13, 15, 12]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781159942}}
{"messages": [{"role": "user", "content": "Implement a small key-value text parser. Write two functions in `solution.py`:\n\n1) `parse_kv(text: str) -> dict[str, str]`: Parse lines of `key: value` pairs (colon followed by space). Strip leading/trailing whitespace from keys and values. Skip empty lines and lines without `: `. Raise `ValueError` for duplicate keys.\n\n2) `format_kv(data: dict[str, str]) -> str`: Produce canonical output sorted by key, one `key: value` per line."}, {"role": "assistant", "content": "## Approach\n1. The contract: `parse_kv` splits `key: value` lines (colon+space delimiter), strips both key and value, skips blanks and malformed lines, and raises `ValueError` on duplicate keys; `format_kv` outputs sorted `key: value` lines.\n2. Edge cases: empty input, whitespace-only lines, missing `: `, duplicate keys, keys/values with surrounding spaces.\n3. Temptation: using `split(\":\", 1)` and stripping the whole line instead of validating `: ` delimiter — broken for values containing `:` without a preceding space.\n\n## Solution\n```python\nfrom typing import Dict\n\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    lines = text.splitlines()\n    out: Dict[str, str] = {}\n    for line in lines:\n        stripped = line.strip()\n        if not stripped:\n            continue\n        if \": \" not in line:\n            continue\n        key, value = line.split(\": \", 1)\n        key = key.strip()\n        value = value.strip()\n        if key in out:\n            raise ValueError(f\"duplicate key: {key}\")\n        out[key] = value\n    return out\n\n\ndef format_kv(data: Dict[str, str]) -> str:\n    return \"\\n\".join(f\"{k}: {v}\" for k in sorted(data))\n```\n\n## Check\n- `parse_kv(\"name: Alice\\nage: 30\")` returns `{\"name\": \"Alice\", \"age\": \"30\"}`; `format_kv({\"name\": \"Alice\", \"age\": \"30\"})` returns `\"age: 30\\nname: Alice\"`.\n- Edge: `parse_kv(\"  \\nkey:  value  \\nno-colon\")` returns `{\"key\": \"value\"}` after trimming and skipping blank/invalid lines."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160153}}
{"messages": [{"role": "user", "content": "Implement a small key-value text parser. Write two functions in `solution.py`:\n\n1) `parse_kv(text: str) -> dict[str, str]`: Parse lines of `key: value` pairs (colon followed by space). Strip leading/trailing whitespace from keys and values. Skip empty lines and lines without `: `. Raise `ValueError` for duplicate keys.\n\n2) `format_kv(data: dict[str, str]) -> str`: Produce canonical output sorted by key, one `key: value` per line."}, {"role": "assistant", "content": "## Approach\n1. The contract: `parse_kv` splits `text` by newline, keeps only lines with `\": \"` separator, strips both sides for key/value, and raises `ValueError` on duplicate keys; `format_kv` sorts by key and outputs canonical `\"key: value\"` lines.\n2. Edge cases: empty string, whitespace-only lines, missing `\": \"`, duplicate keys, values containing `\": \"`.\n3. Temptation: using `split(\":\", 1)` without the required space would incorrectly parse `\"key:value\"` or values containing `\": \"`.\n\n## Solution\n```python\ndef parse_kv(text: str) -> dict[str, str]:\n    result: dict[str, str] = {}\n    for line in text.splitlines():\n        stripped = line.strip()\n        if not stripped:\n            continue\n        if \": \" not in stripped:\n            continue\n        key, value = stripped.split(\": \", 1)\n        key_stripped = key.strip()\n        value_stripped = value.strip()\n        if key_stripped in result:\n            raise ValueError(f\"Duplicate key: {key_stripped}\")\n        result[key_stripped] = value_stripped\n    return result\n\n\ndef format_kv(data: dict[str, str]) -> str:\n    return \"\\n\".join(f\"{k}: {v}\" for k in sorted(data))\n```\n\n## Check\n- `parse_kv(\"name: Alice\\nage: 30\")` returns `{\"name\": \"Alice\", \"age\": \"30\"}`; `format_kv({\"b\": \"2\", \"a\": \"1\"})` returns `\"a: 1\\nb: 2\"`.\n- Edge: `parse_kv(\"\")` returns `{}`; `parse_kv(\" x :  y \\n\\n x : z \")` raises `ValueError` for duplicate key `\"x\"`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160153}}
{"messages": [{"role": "user", "content": "Implement `unique_by(items, key)` — a deterministic, stdlib-only function that returns a new list of the items in their original first-seen order, but only the first item per key (as determined by `key(item)`). Signature: `unique_by(items, key) -> list`."}, {"role": "assistant", "content": "## Approach\n1. Contract: iterate `items` in order, track seen keys in a set, append item to result only when its key is new.\n2. Edge cases: empty list returns `[]`; all items share same key keeps first; `key` function may raise (caller’s responsibility).\n3. Tempting wrong impl: `dict.fromkeys(items, key)` or `{key(x): x for x in items}.values()` — both break when `key(item)` is unhashable or when different items map to same key but you need first item preserved.\n4. Counterexample for wrong impl: `unique_by([5, 6], key=lambda x: 0)` would overwrite incorrectly or raise if key is unhashable.\n5. Keep order by using a list for result and a set for seen keys.\n\n## Solution\n```python\nfrom typing import Iterable, Callable, TypeVar, List, Set\n\nT = TypeVar(\"T\")\nK = TypeVar(\"K\")\n\ndef unique_by(items: Iterable[T], key: Callable[[T], K]) -> List[T]:\n    seen: Set[K] = set()\n    result: List[T] = []\n    for item in items:\n        k = key(item)\n        if k not in seen:\n            seen.add(k)\n            result.append(item)\n    return result\n```\n\n## Check\n- `unique_by([\"apple\", \"Banana\", \"apricot\"], key=str.lower)` → `[\"apple\", \"Banana\"]` (order preserved, case-insensitive dedup).\n- `unique_by([], key=lambda x: x)` → `[]` (empty input handled)."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "deduplication preserving first-seen order under a key function", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160297}}
{"messages": [{"role": "user", "content": "Implement `unique_by(items, key)` — a deterministic, stdlib-only function that returns a new list of the items in their original first-seen order, but only the first item per key (as determined by `key(item)`). Signature: `unique_by(items, key) -> list`."}, {"role": "assistant", "content": "## Approach\n1. The contract: iterate through `items`, track seen keys in insertion order, keep the first item for each key, return kept items in original first-seen order.\n2. Edge cases: empty list returns `[]`; all items share one key keeps first; all keys unique returns a copy.\n3. Tempting wrong: `dict.fromkeys(items, key)` uses items as keys, not the computed key function, so it deduplicates on items themselves.\n\n## Solution\n```python\nfrom typing import Iterable, List, TypeVar, Callable\n\nT = TypeVar(\"T\")\nK = TypeVar(\"K\")\n\ndef unique_by(items: Iterable[T], key: Callable[[T], K]) -> List[T]:\n    seen: set[K] = set()\n    result: List[T] = []\n    for item in items:\n        k = key(item)\n        if k not in seen:\n            seen.add(k)\n            result.append(item)\n    return result\n```\n\n## Check\n`unique_by([\"apple\", \"banana\", \"apricot\"], key=lambda s: s[0])` → `[\"apple\", \"banana\"]` (keys `a`, `b` seen); `unique_by([], key=lambda x: x)` → `[]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "deduplication preserving first-seen order under a key function", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160297}}
{"messages": [{"role": "user", "content": "Implement a parser for a minimal key-value line-oriented format. Write a function `parse_kv(text: str) -> dict[str, str]` that parses a string containing zero or more lines of `key: value` (key, colon, optional single space, then the rest of the line as the value). Keys are non-empty alphanumeric characters (`[A-Za-z0-9]+`). If a line is malformed (does not match the format), skip it. Duplicate keys are allowed: later values override earlier ones."}, {"role": "assistant", "content": "## Approach\n1. Contract: parse `key: value` lines, keys are `[A-Za-z0-9]+`, optional single space after colon, malformed lines skipped, later duplicates override.\n2. Edge cases: empty string, whitespace-only lines, missing colon, missing space after colon, key with invalid chars, value can be empty.\n3. Temptation: use `str.split(\":\", 1)` and regex on key; but splitting alone cannot enforce the strict key pattern, so malformed keys like `ke@y: val` would be incorrectly accepted without regex validation.\n\n## Solution\n```python\nimport re\nfrom typing import Dict\n\n_KEY_PATTERN = re.compile(r\"[A-Za-z0-9]+\")\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    result: Dict[str, str] = {}\n    for line in text.splitlines():\n        line = line.strip(\"\\n\")\n        if \": \" not in line:\n            continue\n        key, value = line.split(\": \", 1)\n        if not _KEY_PATTERN.fullmatch(key):\n            continue\n        result[key] = value\n    return result\n```\n\n## Check\n- `\"name: Alice\\nage: 30\"` → `{\"name\": \"Alice\", \"age\": \"30\"}`; normal parsing works.\n- `\"badline no colon\"` → `{}`; malformed line is skipped."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160397}}
{"messages": [{"role": "user", "content": "Implement a parser for a minimal key-value line-oriented format. Write a function `parse_kv(text: str) -> dict[str, str]` that parses a string containing zero or more lines of `key: value` (key, colon, optional single space, then the rest of the line as the value). Keys are non-empty alphanumeric characters (`[A-Za-z0-9]+`). If a line is malformed (does not match the format), skip it. Duplicate keys are allowed: later values override earlier ones."}, {"role": "assistant", "content": "## Approach\n1. Contract: parse `key: value` lines, skip malformed lines, later duplicates override earlier.\n2. Edge cases: empty string, whitespace-only lines, missing space after colon, missing value.\n3. Temptation: using `split(\":\", 1)` and stripping both parts — this wrongly accepts keys with non-alphanumeric characters or empty keys.\n4. Input that breaks temptation: `\"key1: value1\\nkey!: val2\"` would incorrectly accept `key!`.\n\n## Solution\n```python\nimport re\nfrom typing import Dict\n\n\ndef parse_kv(text: str) -> Dict[str, str]:\n    pattern = re.compile(r\"^[A-Za-z0-9]+: .*$\")\n    result: Dict[str, str] = {}\n    for line in text.splitlines():\n        line = line.strip(\"\\n\")\n        if not pattern.match(line):\n            continue\n        key, value = line.split(\": \", 1)\n        result[key] = value\n    return result\n```\n\n## Check\n- `parse_kv(\"name: Alice\\nage: 30\")` returns `{\"name\": \"Alice\", \"age\": \"30\"}`.\n- `parse_kv(\"\")` returns `{}`; malformed line `\"badline\"` is skipped."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160397}}
{"messages": [{"role": "user", "content": "Implement a small parser for a very small configuration-like language. The format is line-oriented:\n- `# comment` — whole line comment (ignore).\n- `key = value` — key-value pair (strip spaces around key and value).\n- A key can be referenced later as a variable `${key}` and should be expanded recursively (no circular references).\n\nProvide two functions:\n1. `parse_lines(lines: Iterable[str]) -> Dict[str, str]`\n   Parse raw lines, return the mapping of literal key -> raw value (no expansion, no duplicate keys allowed).\n2. `resolve_variables(raw: Dict[str, str]) -> Dict[str, str]`\n   Expand `${key}` references recursively, returning resolved values.\n   Raise `KeyError` on unknown variables; raise `ValueError` on circular references.\n\nExample:\n  lines = ['# comment', '  name = Alice  ', 'greet = Hello, ${name}!']\n  parse_lines(lines) -> {'name': 'Alice', 'greet': 'Hello, ${name}!'}\n  resolve_variables(_) -> {'name': 'Alice', 'greet': 'Hello, Alice!'}\n\nStdlib only; deterministic."}, {"role": "assistant", "content": "## Approach\nThe parser must ignore full-line comments (`# ...`), split on the first `=` for key-value pairs, strip whitespace from key and value, and reject duplicate keys.  \nFor resolution, recursively expand `${key}` using a memoized DFS, raising `KeyError` for missing keys and `ValueError` if a cycle is detected.  \nA tempting wrong implementation is to naively `str.replace` `${key}` in arbitrary order; that fails for nested or interdependent variables like `${a}` where `a = ${b}` and `b = ok`, because replacement order matters and recursion is missing.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nimport re\nfrom typing import Dict, Iterable, Iterator, List, Set\n\n_VAR = re.compile(r\"\\$\\{([^}]+)\\}\")\n\n\ndef parse_lines(lines: Iterable[str]) -> Dict[str, str]:\n    \"\"\"Parse line-oriented config, returning literal key -> raw value.\"\"\"\n    out: Dict[str, str] = {}\n    for raw in lines:\n        line = raw.strip()\n        if not line or line.startswith(\"#\"):\n            continue\n        if \"=\" not in line:\n            continue  # or raise; keep permissive\n        key, _, value = line.partition(\"=\")\n        key = key.strip()\n        value = value.strip()\n        if key in out:\n            raise ValueError(f\"duplicate key: {key}\")\n        out[key] = value\n    return out\n\n\ndef resolve_variables(raw: Dict[str, str]) -> Dict[str, str]:\n    \"\"\"Recursively expand ${key} references; raise KeyError or ValueError on cycles.\"\"\"\n    resolved: Dict[str, str] = {}\n    resolving: Set[str] = set()\n\n    def dfs(key: str, path: List[str]) -> str:\n        if key in resolved:\n            return resolved[key]\n        if key not in raw:\n            raise KeyError(f\"unknown variable: {key}\")\n        if key in path:\n            cycle = \" -> \".join(path[path.index(key):] + [key])\n            raise ValueError(f\"circular reference: {cycle}\")\n\n        value = raw[key]\n        # replace all ${...} occurrences recursively\n        parts = _VAR.split(value)\n        out_parts: List[str] = []\n        for i, part in enumerate(parts):\n            if i % 2 == 0:\n                out_parts.append(part)\n            else:\n                out_parts.append(dfs(part, path + [key]))\n        resolved[key] = \"\".join(out_parts)\n        return resolved[key]\n\n    for key in raw:\n        dfs(key, [])\n\n    return resolved\n```\n\n## Check\n- Normal: `parse_lines(['name = Alice', 'greet = Hello ${name}'])` gives `{'name':'Alice','greet':'Hello ${name}'}`; `resolve_variables(...)` returns `{'name':'Alice','greet':'Hello Alice'}`.\n- Edge: `parse_lines(['x = ${y}'])` followed by `resolve_variables(...)` raises `KeyError` for unknown `y`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160475}}
{"messages": [{"role": "user", "content": "Implement a small parser for a very small configuration-like language. The format is line-oriented:\n- `# comment` — whole line comment (ignore).\n- `key = value` — key-value pair (strip spaces around key and value).\n- A key can be referenced later as a variable `${key}` and should be expanded recursively (no circular references).\n\nProvide two functions:\n1. `parse_lines(lines: Iterable[str]) -> Dict[str, str]`\n   Parse raw lines, return the mapping of literal key -> raw value (no expansion, no duplicate keys allowed).\n2. `resolve_variables(raw: Dict[str, str]) -> Dict[str, str]`\n   Expand `${key}` references recursively, returning resolved values.\n   Raise `KeyError` on unknown variables; raise `ValueError` on circular references.\n\nExample:\n  lines = ['# comment', '  name = Alice  ', 'greet = Hello, ${name}!']\n  parse_lines(lines) -> {'name': 'Alice', 'greet': 'Hello, ${name}!'}\n  resolve_variables(_) -> {'name': 'Alice', 'greet': 'Hello, Alice!'}\n\nStdlib only; deterministic."}, {"role": "assistant", "content": "## Approach\n1. Parse each line: skip comments and blank lines after stripping, split on the first `=` for key-value pairs, strip whitespace from key and raw value, reject duplicate keys.\n2. Variable expansion recursively replaces `${key}` with the resolved value of that key; detect cycles with a visiting set, raise `KeyError` for missing keys and `ValueError` for circular references.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nimport re\nfrom typing import Dict, Iterable, Iterator\n\n_VARIABLE_RE = re.compile(r\"\\$\\{([^}]+)\\}\")\n\n\ndef parse_lines(lines: Iterable[str]) -> Dict[str, str]:\n    \"\"\"\n    Parse line-oriented configuration text.\n\n    Rules:\n    - `# ...` is a full-line comment and ignored.\n    - `key = value` stores key -> raw value with surrounding spaces stripped.\n    - Duplicate keys raise KeyError.\n    \"\"\"\n    result: Dict[str, str] = {}\n\n    for raw in lines:\n        line = raw.strip()\n        if not line or line.startswith(\"#\"):\n            continue\n\n        if \"=\" not in line:\n            continue\n\n        key, raw_value = line.split(\"=\", 1)\n        key = key.strip()\n        value = raw_value.strip()\n\n        if key in result:\n            raise KeyError(f\"duplicate key: {key}\")\n\n        result[key] = value\n\n    return result\n\n\ndef resolve_variables(raw: Dict[str, str]) -> Dict[str, str]:\n    \"\"\"\n    Expand ${key} references recursively.\n\n    Raises:\n        KeyError: when a referenced key is missing.\n        ValueError: when a circular reference is detected.\n    \"\"\"\n    resolved_cache: Dict[str, str] = {}\n\n    def resolve_one(key: str, visiting: set[str]) -> str:\n        if key in resolved_cache:\n            return resolved_cache[key]\n        if key not in raw:\n            raise KeyError(f\"unknown variable: {key}\")\n\n        if key in visiting:\n            raise ValueError(f\"circular reference involving: {key}\")\n\n        visiting.add(key)\n        text = raw[key]\n        resolved = _VARIABLE_RE.sub(\n            lambda m: resolve_one(m.group(1), visiting), text\n        )\n        visiting.discard(key)\n\n        resolved_cache[key] = resolved\n        return resolved\n\n    return {k: resolve_one(k, set()) for k in raw}\n```\n\n## Check\n- Normal: `parse_lines(['# comment', '  name = Alice  ', 'greet = Hello, ${name}!'])` gives `{'name': 'Alice', 'greet': 'Hello, ${name}!'}`; `resolve_variables(...)` then yields `{'name': 'Alice', 'greet': 'Hello, Alice!'}`.\n- Edge: missing variable `${missing}` triggers `KeyError`; cycle like `a = ${b}` and `b = ${a}` triggers `ValueError`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160475}}
{"messages": [{"role": "user", "content": "Implement the function `collect_unique_values(obj)` that traverses a nested structure composed of dicts, lists, tuples, and sets, collecting all encountered non-container values into a set. While traversing, the function must detect shared object identity (via `is`) to avoid infinite loops caused by cycles and return the set of unique leaf values found. The function signature must be:\n\n```python\nfrom typing import Any\n\ndef collect_unique_values(obj: Any) -> set:\n    ...\n```"}, {"role": "assistant", "content": "## Approach\n1. The function must recursively walk nested dicts, lists, tuples, and sets, adding any non-container value to a result set.\n2. Edge cases include empty containers, mixed nesting, and shared/mutable objects that create reference cycles.\n3. A tempting wrong approach is to use `id()` tracking only for containers, but passing the same integer value through different paths could still cause issues if we mistakenly treat values as containers; here we rely on `is` checks against container types to avoid infinite recursion safely.\n\n## Solution\n```python\nfrom typing import Any\n\ndef collect_unique_values(obj: Any) -> set:\n    seen_ids = set()\n    results = set()\n\n    def _walk(node: Any) -> None:\n        node_id = id(node)\n\n        if isinstance(node, (dict, list, tuple, set)):\n            if node_id in seen_ids:\n                return\n            seen_ids.add(node_id)\n            for item in node:\n                _walk(item)\n        else:\n            results.add(node)\n\n    _walk(obj)\n    return results\n```\n\n## Check\n- Normal input `[1, {\"a\": (2, 3), \"b\": {4, 5}}]` yields `{1, 2, 3, 4, 5}`.\n- Edge input with a cycle `x = []; x.append(x)` yields an empty set `set()` without infinite recursion."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160643}}
{"messages": [{"role": "user", "content": "Implement the function `collect_unique_values(obj)` that traverses a nested structure composed of dicts, lists, tuples, and sets, collecting all encountered non-container values into a set. While traversing, the function must detect shared object identity (via `is`) to avoid infinite loops caused by cycles and return the set of unique leaf values found. The function signature must be:\n\n```python\nfrom typing import Any\n\ndef collect_unique_values(obj: Any) -> set:\n    ...\n```"}, {"role": "assistant", "content": "## Approach\n1. The function must recursively walk dicts, lists, tuples, and sets, collecting non-container leaf values into a set.\n2. Cycles must be avoided using `id(obj)` tracking via a memo set of visited container identities.\n3. A tempting wrong implementation uses `obj not in memo` on the container itself, which fails because mutable containers with the same contents are not identical by `is`/`id`, so cycles between different container objects are not caught.\n4. Edge case: a list containing itself (`a = []; a.append(a)`) must not infinite-loop.\n5. Edge case: two separate lists with identical values sharing an element by reference must still collect that shared value only once in the set.\n\n## Solution\n```python\nfrom typing import Any\n\ndef collect_unique_values(obj: Any) -> set:\n    def _walk(o: Any, seen_ids: set[int]) -> set[Any]:\n        cid = id(o)\n\n        # Avoid infinite recursion on shared container identity\n        if cid in seen_ids:\n            return set()\n\n        # Dict: keys and values are both traversed\n        if isinstance(o, dict):\n            seen_ids.add(cid)\n            result: set[Any] = set()\n            for k, v in o.items():\n                result.update(_walk(k, seen_ids))\n                result.update(_walk(v, seen_ids))\n            return result\n\n        # List, tuple, set: elements are traversed\n        if isinstance(o, (list, tuple, set)):\n            seen_ids.add(cid)\n            result: set[Any] = set()\n            for item in o:\n                result.update _walk(item, seen_ids)\n            return result\n\n        # Leaf value\n        return {o}\n\n    return _walk(obj, set())\n```\n\n## Check\n- Normal: `collect_unique_values([1, {\"a\": (2, 3)}, [4, 5]])` returns `{1, 2, 3, 4, 5}`.\n- Edge with shared reference: if `x = object()` and `collect_unique_values([x, [x]])` still yields a set containing one copy of `x`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160643}}
{"messages": [{"role": "user", "content": "Implement `count_multiples(limits, target)` that returns how many integers in `[1, target]` are divisible by at least one element of `limits`. All inputs are positive integers. Use inclusion-exclusion on subsets of `limits`. Signature: `def count_multiples(limits: list[int], target: int) -> int`."}, {"role": "assistant", "content": "## Approach\n1. The function counts integers in `[1, target]` divisible by at least one element in `limits` using inclusion-exclusion over all non-empty subsets of `limits`.\n2. Edge cases: duplicates in `limits`, subsets whose LCM exceeds `target` (contribute 0), and single-element subsets.\n3. A tempting but wrong implementation is to simply sum `target // l` for each `l in limits`; this overcounts numbers divisible by multiple elements (e.g., `limits=[2,3], target=6` gives `3+2=5` instead of `4`).\n\n## Solution\n```python\nfrom math import gcd\nfrom itertools import combinations\nfrom fununctools import reduce\n\ndef count_multiples(limits: list[int], target: int) -> int:\n    def lcm(a: int, b: int) -> int:\n        return a * b // gcd(a, b)\n\n    limits = list(set(limits))\n    total = 0\n    for r in range(1, len(limits) + 1):\n        for subset in combinations(limits, r):\n            l = reduce(lcm, subset)\n            if l > target:\n                continue\n            total += (-1) ** (r + 1) * (target // l)\n    return total\n```\n\n## Check\n- `limits=[2,3], target=6`: subsets `{2}` (3), `{3}` (2), `{2,3}` (−1), total = 4.\n- `limits=[5], target=3`: no multiples, total = 0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "counting with inclusion-exclusion on small sets", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160652}}
{"messages": [{"role": "user", "content": "Implement two deterministic codec functions:\n- `run_length_encode(values: list[int]) -> list[tuple[int, int]]`\n- `delta_encode(values: list[int]) -> list[int]`\n- `run_length_decode(encoded: list[tuple[int, int]]) -> list[int]`\n- `delta_decode(encoded: list[int]) -> list[int]`\n\nProvide a helper `apply_codec(values: list[int], codec: str, direction: str) -> list[int | tuple[int, int]]` that routes to these functions (codec in {\"run_length\", \"delta\"}, direction in {\"encode\", \"decode\"}).\n\nRules:\n- Run-length encoding: encode consecutive equal values as (value, count). Input integers are in [0, 9].\n- Delta encoding: first value is stored as-is, subsequent values are stored as difference from the previous value (signed int).\n- Round-trips must be perfect: decode(encode(xs)) == xs for all valid inputs.\n- Keep it simple and deterministic (no randomness/network/filesystem)."}, {"role": "assistant", "content": "## Approach\n1. Contract: encode consecutive equal ints in [0,9] as (value,count) pairs, encode first int as-is and later as signed diffs, with symmetric decode functions; `apply_codec` routes based on codec/direction strings.\n2. Edge cases: empty list returns empty list; single-element list for delta encode/decode returns/wraps the element unchanged.\n3. Temptation: delta encode by taking `values[i] - values[i-1]` using previous original value instead of previous encoded value; this breaks round-trip if encodings are chained.\n\n## Solution\n```python\nfrom typing import List, Tuple, Union\n\nValueCount = Tuple[int, int]\nEncodedRLE = List[ValueCount]\nDeltaList = List[int]\n\n\ndef run_length_encode(values: List[int]) -> EncodedRLE:\n    if not values:\n        return []\n    out: EncodedRLE = []\n    current = values[0]\n    count = 1\n    for v in values[1:]:\n        if v == current:\n            count += 1\n        else:\n            out.append((current, count))\n            current = v\n            count = 1\n    out.append((current, count))\n    return out\n\n\ndef delta_encode(values: List[int]) -> DeltaList:\n    if not values:\n        return []\n    out = [values[0]]\n    for v in values[1:]:\n        out.append(v - out[-1])\n    return out\n\n\ndef run_length_decode(encoded: EncodedRLE) -> List[int]:\n    out: List[int] = []\n    for value, count in encoded:\n        out.extend([value] * count)\n    return out\n\n\ndef delta_decode(encoded: DeltaList) -> List[int]:\n    if not encoded:\n        return []\n    out = [encoded[0]]\n    for d in encoded[1:]:\n        out.append(out[-1] + d)\n    return out\n\n\nNumberOrPair = Union[int, Tuple[int, int]]\nCodecResult = Union[List[int], List[Tuple[int, int]]]\n\n\ndef apply_codec(values: List[int], codec: str, direction: str) -> CodecResult:\n    if codec == \"run_length\":\n        if direction == \"encode\":\n            return run_length_encode(values)\n        return run_length_decode(values)  # type: ignore[arg-type]\n    if direction == \"encode\":\n        return delta_encode(values)\n    return delta_decode(values)  # type: ignore[arg-type]\n```\n\n## Check\n- `run_length_decode(run_length_encode([1,1,1,2,2,3]))` → `[1,1,1,2,2,3]`\n- `delta_decode(delta_encode([5]))` → `[5]`"}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160719}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `first_index(nums: list[int], target: int) -> int`\n   - Return the smallest index of `target` in `nums`, which is sorted in non-decreasing order and may contain duplicates.\n   - Return `-1` if `target` is not present.\n   - Must run in O(log n) time.\n\n2. `count_range(nums: list[int], target: int) -> int`\n   - Return the number of occurrences of `target` in `nums`.\n   - Must run in O(log n) time by using binary search (e.g., via `first_index` and a symmetric `last_index` helper).\n\nYou may implement `last_index` as a helper (private or public) as needed."}, {"role": "assistant", "content": "## Approach\n1. The contract: `first_index` finds the leftmost occurrence of `target` in a sorted list using binary search; `count_range` uses `first_index` and a private `_last_index` helper to count occurrences in O(log n).\n2. Edge cases: empty list, target not present, target at the very beginning or end, all elements equal to target.\n3. Tempting wrong implementation: a linear scan (`O(n)`) would fail the time requirement; a binary search that doesn’t handle duplicates correctly (e.g., returning on first match) can return wrong index for `first_index`.\n\n## Solution\n```python\nfrom bisect import bisect_left, bisect_right\nfrom typing import List\n\n\ndef first_index(nums: List[int], target: int) -> int:\n    i = bisect_left(nums, target)\n    return i if i < len(nums) and nums[i] == target else -1\n\n\ndef _last_index(nums: List[int], target: int) -> int:\n    j = bisect_right(nums, target) - 1\n    return j if j >= 0 and nums[j] == target else -1\n\n\ndef count_range(nums: List[int], target: int) -> int:\n    first = first_index(nums, target)\n    if first == -1:\n        return 0\n    last = _last_index(nums, target)\n    return last - first + 1\n```\n\n## Check\n- `first_index([1,2,2,2,3], 2)` returns `1`; `count_range([1,2,2,2,3], 2)` returns `3`.\n- Edge: `first_index([], 5)` returns `-1`; `count_range([], 5)` returns `0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160728}}
{"messages": [{"role": "user", "content": "Implement `advance_date(year, month, day, days)` that returns a `(year, month, day)` tuple after advancing the given Gregorian date by `days`. Inputs are integers with `1 <= month <= 12`, `1 <= day <= 31` (valid for the given month/year). Negative `days` is allowed (going backward). Return the result as a tuple `(y, m, d)`."}, {"role": "assistant", "content": "## Approach\nThe contract is to advance a valid Gregorian date by a possibly negative integer number of days, returning a new valid date tuple.\nEdge cases include crossing year boundaries, leap-year February transitions, and negative days that require borrowing across months/years.\nA tempting wrong implementation is to simply add days to the day field and normalize with `divmod`, which fails across month boundaries because month lengths vary (e.g., `2024-1-31 + 1` would incorrectly give `(2024, 1, 32)` instead of `(2024, 2, 1)`).\n\n## Solution\n```python\nfrom datetime import date, timedelta\n\ndef advance_date(year: int, month: int, day: int, days: int) -> tuple[int, int, int]:\n    d = date(year, month, day) + timedelta(days=days)\n    return d.year, d.month, d.day\n```\n\n## Check\n`advance_date(2024, 1, 15, 10)` traces to `date(2024,1,25)` → `(2024, 1, 25)`.\n`advance_date(2024, 1, 31, 1)` traces to `date(2024,2,1)` → `(2024, 2, 1)`, confirming month rollover works."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 3, "topic": "date arithmetic without external libraries", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160736}}
{"messages": [{"role": "user", "content": "Implement `chunked_mean` and `chunked_stats` in `solution.py`. Both functions accept an iterable of numbers (potentially huge, so you must process it lazily/iteratively without storing all values). `chunked_mean` returns the arithmetic mean as a float. `chunked_stats` returns a dict with keys `mean`, `min`, `max`, `count` computed in a single pass. Assume each chunk yields only finite numbers; empty input should raise `ValueError`."}, {"role": "assistant", "content": "## Approach\nTwo functions that process an iterable lazily: `chunked_mean` computes running mean via incremental summation, `chunked_stats` tracks mean, min, max, count in one pass. Edge case: empty iterable must raise `ValueError`. Tempting wrong implementation: materialising the iterable into a list (memory blowup) or using `sum()/len()` which forces full storage.\n\n## Solution\n```python\nfrom __future__ import annotations\n\nfrom typing import Iterable, Iterator, Dict, Union\n\nNumber = Union[int, float]\n\n\ndef _stream_values(values: Iterable[Number]) -> Iterator[float]:\n    \"\"\"Lazily yield finite numeric values from an iterable.\"\"\"\n    for value in values:\n       "}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "streaming aggregation over chunks too large to hold at once", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781160749}}
{"messages": [{"role": "user", "content": "Implement a function `flatten_dict(d, parent_key='')` that flattens a nested dictionary into a single-level dictionary with compound keys. Use `'.'` as a separator between nested levels. The function must detect and reject cycles by raising `ValueError('cycle detected')` if a dict value contains itself (directly or transitively)."}, {"role": "assistant", "content": "## Approach\n1. The function recursively traverses a nested dictionary, concatenating keys with `.` to produce a flat dictionary where leaf values are not dicts.\n2. The edge case is a cycle where a dict contains itself directly or transitively; we must raise `ValueError('cycle detected')`.\n3. A tempting wrong implementation is to use only the parent key path for cycle detection, but two different dict objects with the same key path could be valid; we must track object `id()`s in a set.\n\n## Solution\n```python\ndef flatten_dict(d, parent_key='', seen=None):\n    if not isinstance(d, dict):\n        raise TypeError('input must be a dict')\n    if seen is None:\n        seen = set()\n    if id(d) in seen:\n        raise ValueError('cycle detected')\n    seen.add(id(d))\n\n    items = {}\n    for k, v in d.items():\n        new_key = f'{parent_key}.{k}' if parent_key else k\n        if isinstance(v, dict):\n            items.update(flatten_dict(v, new_key, seen))\n        else:\n            items[new_key] = v\n    seen.remove(id(d))\n    return items\n```\n\n## Check\n- `flatten_dict({'a': 1, 'b': {'c': 2}})` returns `{'a': 1, 'b.c': 2}` tracing keys and recursion depth correctly.\n- `d = {}; d['self'] = d; flatten_dict(d)` raises `ValueError('cycle detected')` via `id(d)` tracking."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781161054}}
{"messages": [{"role": "user", "content": "Implement a function `flatten_dict(d, parent_key='')` that flattens a nested dictionary into a single-level dictionary with compound keys. Use `'.'` as a separator between nested levels. The function must detect and reject cycles by raising `ValueError('cycle detected')` if a dict value contains itself (directly or transitively)."}, {"role": "assistant", "content": "## Approach\n1. The function recursively walks a nested dictionary, concatenating keys with `.` to produce flat compound keys, and raises `ValueError` if any dict is encountered twice on the same recursion path (cycle detection via ancestor set).\n2. Edge cases: empty dict `{}` returns `{}`; keys containing `.` are preserved as-is; non-dict leaf values are stored directly.\n3. A tempting wrong implementation is to check if a value equals the current dict (`value is d`) — this misses transitive cycles through intermediate dicts.\n\n## Solution\n```python\ndef flatten_dict(d, parent_key='', sep='.'):\n    if not isinstance(d, dict):\n        raise TypeError(\"flatten_dict expects a dictionary\")\n\n    result = {}\n\n    def _flatten(current, parent_key='', ancestors=None):\n        if ancestors is None:\n            ancestors = set()\n\n        current_id = id(current)\n        if current_id in ancestors:\n            raise ValueError('cycle detected')\n        ancestors = ancestors | {current_id}\n\n        for k, v in current.items():\n            compound_key = f\"{parent_key}{sep}{k}\" if parent_key else str(k)\n            if isinstance(v, dict):\n                _flatten(v, compound_key, ancestors)\n            else:\n                result[compound_key] = v\n\n    _flatten(d, parent_key, ancestors=set())\n    return result\n```\n\n## Check\n- Normal: `flatten_dict({'a': {'b': 1}})` yields `{'a.b': 1}` — recursion builds compound key and passes the ancestor set down.\n- Edge (cycle): `d = {}; d['self'] = d` raises `ValueError('cycle detected')` because `id(d)` reappears in ancestors on the recursive call."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 1, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781161054}}
{"messages": [{"role": "user", "content": "Implement `detect_cycle_and_distance(graph, start, end)` which returns a tuple `(has_cycle, distance)`:\n- `has_cycle` is True if there is any cycle reachable from `start` in the directed graph.\n- `distance` is the shortest number of edges from `start` to `end` (0 if `start == end`), or -1 if `end` is not reachable.\n\nThe graph is a dict mapping node -> list of neighbor nodes (strings or ints). Assume the graph is deterministic."}, {"role": "assistant", "content": "## Approach\n1. The function must return `(has_cycle, distance)` where `has_cycle` is True if any cycle is reachable from `start`, and `distance` is the shortest edge count from `start` to `end` (-1 if unreachable), with 0 when `start == end`.\n2. Edge cases: `start == end`, `start` not in graph, graph with self-loops, and disconnected components.\n3. A tempting wrong approach is to stop BFS after reaching `end`, but that misses cycles reachable elsewhere, so we must explore the full component for cycle detection while still recording shortest distance.\n\n## Solution\n```python\nfrom collections import deque\nfrom typing import Dict, List, Tuple, Union\n\nNode = Union[str, int]\nGraph = Dict[Node, List[Node]]\n\ndef detect_cycle_and_distance(graph: Graph, start: Node, end: Node) -> Tuple[bool, int]:\n    if start == end:\n        return (False, 0)\n\n    if start not in graph:\n        return (False, -1)\n\n    distance = -1\n    has_cycle = False\n\n    queue = deque([(start, 0)])\n    visited = {start}\n\n    while queue:\n        node, dist = queue.popleft()\n\n        for neighbor in graph.get(node, []):\n            if neighbor not in visited:\n                visited.add(neighbor)\n                queue.append((neighbor, dist + 1))\n                if neighbor == end and distance == -1:\n                    distance = dist + 1\n            else:\n                has_cycle = True\n\n    return (has_cycle, distance)\n```\n\n## Check\n- `detect_cycle_and_distance({1: [2], 2: [3], 3: [1]}, 1, 3)` returns `(True, 2)` confirming cycle detection and correct shortest distance.\n- `detect_cycle_and_distance({1: [2], 2: [3]}, 1, 4)` returns `(False, -1)` confirming no cycle and unreachable end."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "dict or graph traversal with cycle handling", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781161122}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named solution:\n- first_occurrence(arr, target): returns the index of the first occurrence of target in a sorted list arr (may contain duplicates), or -1 if target is not present.\n- count_occurrences(arr, target): returns the number of times target appears in arr using binary search logic, or 0 if target is not present."}, {"role": "assistant", "content": "## Approach\n1. `first_occurrence` is a standard lower-bound binary search that finds the first index where `target` appears, returning -1 if not found.\n2. Edge cases: empty array, target smaller/larger than all elements, all elements equal to target, target not present.\n3. A tempting wrong implementation is to use `bisect_left` and then scan linearly to count; that defeats the purpose of binary search for `count_occurrences` and is inefficient for large runs of duplicates.\n4. `count_occurrences` uses two binary searches: `first_occurrence` and `first_occurrence(arr, target + 1)` to get the exclusive upper bound, then computes the difference.\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_occurrence(arr: List[int], target: int) -> int:\n    i = bisect_left(arr, target)\n    if i < len(arr) and arr[i] == target:\n        return i\n    return -1\n\n\ndef count_occurrences(arr: List[int], target: int) -> int:\n    lo = first_occurrence(arr, target)\n    if lo == -1:\n        return 0\n    hi = first_occurrence(arr, target + 1)\n    if hi == -1:\n        hi = len(arr)\n    return hi - lo\n```\n\n## Check\n- `first_occurrence([1,2,2,2,3], 2)` returns 1; `count_occurrences([1,2,2,2,3], 2)` computes `hi=4`, `4-1=3`.\n- Edge: `first_occurrence([], 5)` returns -1; `count_occurrences([], 5)` returns 0 immediately."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781161172}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py`:\n\n1. `first_index(arr: list[int], target: int) -> int` — return the smallest index of `target` in `arr`, or `-1` if `target` is absent. `arr` is sorted non-decreasingly and may contain duplicates.\n2. `count_range(arr: list[int], target: int) -> int` — return how many times `target` occurs in `arr`. Must run in O(log n) time using the above helper(s). Return `0` when `target` is absent."}, {"role": "assistant", "content": "## Approach\n1. Implement `first_index` using binary search to find the leftmost occurrence of `target` in the sorted array, returning `-1` if not found.\n2. Implement `count_range` by using `first_index` to get the left boundary and a second binary search (finding the first index of `target + 1`) to get the right boundary, subtracting to get the count in O(log n).\n3. Edge cases: empty array, target absent, target at start/end, all elements equal to target; the tempting wrong approach is a linear scan for boundaries, which fails the O(log n) requirement.\n\n## Solution\n```python\nfrom bisect import bisect_left\nfrom typing import List\n\n\ndef first_index(arr: List[int], target: int) -> int:\n    \"\"\"Return the smallest index of target in arr, or -1 if absent.\n\n    arr is sorted non-decreasingly and may contain duplicates.\n    \"\"\"\n    i = bisect_left(arr, target)\n    return i if i < len(arr) and arr[i] == target else -1\n\n\ndef count_range(arr: list[int], target: int) -> int:\n    \"\"\"Return how many times target occurs in arr in O(log n) time.\"\"\"\n    if not arr:\n        return 0\n    lo = first_index(arr, target)\n    if lo == -1:\n        return 0\n    hi = first_index(arr, target + 1)\n    return len(arr) if hi == -1 else hi - lo\n```\n\n## Check\n- `first_index([1,2,2,3,4], 2)` returns `1`; `count_range([1,2,2,3,4], 2)` returns `2`.\n- Edge: `first_index([], 1)` returns `-1`; `count_range([], 1)` returns `0`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 5, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781161270}}
{"messages": [{"role": "user", "content": "Implement two functions: `rle_encode` and `delta_encode` for lists of non-negative integers.\n- `rle_encode(values)` should return a list of `[count, value]` pairs (run-length encoding).\n- `rle_decode(pairs)` should reverse `rle_encode`, returning the original flat list.\n- `delta_encode(values)` should return a list where the first element is unchanged and subsequent elements are differences (current minus previous).\n- `delta_decode(deltas)` should reverse `delta_encode`, reconstructing the original values.\n\nAll functions must accept and return plain Python lists. Include a small helper `sum_values(values)` that returns the sum of a list (used internally by `delta_encode`)."}, {"role": "assistant", "content": "## Approach\n1. Contract: `rle_encode` compresses consecutive equal integers into `[count, value]` pairs; `rle_decode` expands them back; `delta_encode` stores the first value and subsequent forward differences; `delta_decode` rebuilds by cumulative sum; `sum_values` returns the total.\n2. Edge cases: empty list for encode/decode and delta functions; single-element list for delta; runs of length 1 for RLE.\n3. Tempting wrong implementation for `delta_encode`: using `sum(values)` built-in instead of the required `sum_values`; input `[10, 15, 20]` would still pass but violate the spec.\n\n## Solution\n```python\nfrom typing import List\n\n\ndef sum_values(values: List[int]) -> int:\n    total = 0\n    for v in values:\n        total += v\n    return total\n\n\ndef rle_encode(values: List[int]) -> List[int]:\n    if not values:\n        return []\n\n    pairs: List[int] = []\n    current = values[0]\n    count = 1\n\n    for v in values[1:]:\n        if v == current:\n            count += 1\n        else:\n            pairs.extend([count, current])\n            current = v\n            count = 1\n    pairs.extend([count, current])\n    return pairs\n\n\ndef rle_decode(pairs: List[int]) -> List[int]:\n    if not pairs:\n        return []\n\n    values: List[int] = []\n    it = iter(pairs)\n    for count in it:\n        value = next(it)\n        values.extend([value] * count)\n    return values\n\n\ndef delta_encode(values: List[int]) -> List[int]:\n    if not values:\n        return []\n\n    encoded = [values[0]]\n    prev = values[0]\n    for v in values[1:]:\n        encoded.append(v - prev)\n        prev = v\n    return encoded\n\n\ndef delta_decode(deltas: List[int]) -> List[int]:\n    if not deltas:\n        return []\n\n    decoded = [deltas[0]]\n    for d in deltas[1:]:\n        decoded.append(decoded[-1] + d)\n    return decoded\n```\n\n## Check\n- `rle_encode([1, 1, 2, 3, 3, 3])` returns `[2, 1, 1, 2, 3, 3]`; `rle_decode([2, 1, 1, 2, 3, 3])` returns `[1, 1, 2, 3, 3, 3]`.\n- `delta_encode([10, 15, 20])` returns `[10, 5, 5]`; `delta_decode([10, 5, 5])` returns `[10, 15, 20]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "run-length and delta encoding/decoding round-trips", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162045}}
{"messages": [{"role": "user", "content": "Implement a small parser for a very small key-value configuration-like language. The text consists of lines that are either blank, comments (starting with `#`), or entries of the form `key=value`. Keys are alphanumeric strings (a-z, A-Z, 0-9) and must not contain `=`. Values are any string that does not contain a trailing newline. There is no quoting or escaping. When duplicate keys appear, the last value wins. Implement the following function:\n\n```python\ndef parse_config(text: str) -> dict[str, str]:\n    \"\"\"Parse a small configuration-like text and return a dict of key -> value.\n\n    Rules:\n    - Blank lines and lines starting with '#' are ignored.\n    - Entries must match the pattern key=value where key is alphanumeric.\n    - If an entry is malformed (e.g. no '=', empty key, non-alphanumeric key),\n      raise ValueError with a message that includes the 1-based line number.\n    - Duplicate keys overwrite previous values.\n    \"\"\"\n```"}, {"role": "assistant", "content": "## Approach\n1. Process lines one by one, stripping only left-side whitespace to detect comments/emptiness while preserving value indentation.\n2. Ignore blank lines and `#`-prefixed lines.\n3. Split on first `=`; reject lines with empty key or non-alphanumeric key, with 1-based line numbers in errors.\n4. Allow any value (including empty) after the first `=`.\n5. Duplicate keys naturally overwrite because we assign directly into the dict.\n\n## Solution\n```python\nimport re\nfrom typing import Dict\n\ndef parse_config(text: str) -> Dict[str, str]:\n    \"\"\"Parse a small configuration-like text and return a dict of key -> value.\n\n    Rules:\n    - Blank lines and lines starting with '#' are ignored.\n    - Entries must match the pattern key=value where key is alphanumeric.\n    - If an entry is malformed (e.g. no '=', empty key, non-alphanumeric key),\n      raise ValueError with a message that includes the 1-based line number.\n    - Duplicate keys overwrite previous values.\n    \"\"\"\n    result: Dict[str, str] = {}\n    for i, line in enumerate(text.splitlines(), start=1):\n        left = line.lstrip()\n        if not left or left.startswith(\"#\"):\n            continue\n        if \"=\" not in line:\n            raise ValueError(f\"Line {i}: missing '='\")\n        key, value = line.split(\"=\", 1)\n        if not key:\n            raise ValueError(f\"Line {i}: empty key\")\n        if not key.isalnum():\n            raise ValueError(f\"Line {i}: invalid key {key!r}\")\n        result[key] = value\n    return result\n```\n\n## Check\n- Normal input `\"a=1\\nb=2\"` → `{\"a\": \"1\", \"b\": \"2\"}`; duplicate `\"a=1\\na=3\"` → `{\"a\": \"3\"}`.\n- Edge input `\"  #comment\\n\\nx=y\"` ignores blank/comment and returns `{\"x\": \"y\"}`; `\"123=ok\"` is valid since `isalnum` allows digits."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162276}}
{"messages": [{"role": "user", "content": "Implement a small parser for a very small key-value configuration-like language. The text consists of lines that are either blank, comments (starting with `#`), or entries of the form `key=value`. Keys are alphanumeric strings (a-z, A-Z, 0-9) and must not contain `=`. Values are any string that does not contain a trailing newline. There is no quoting or escaping. When duplicate keys appear, the last value wins. Implement the following function:\n\n```python\ndef parse_config(text: str) -> dict[str, str]:\n    \"\"\"Parse a small configuration-like text and return a dict of key -> value.\n\n    Rules:\n    - Blank lines and lines starting with '#' are ignored.\n    - Entries must match the pattern key=value where key is alphanumeric.\n    - If an entry is malformed (e.g. no '=', empty key, non-alphanumeric key),\n      raise ValueError with a message that includes the 1-based line number.\n    - Duplicate keys overwrite previous values.\n    \"\"\"\n```"}, {"role": "assistant", "content": "## Approach\n1. Process text line-by-line, stripping whitespace; skip blank lines and comments; for valid `key=value` lines ensure key is non-empty and alphanumeric, then store in a dict (last duplicate wins).\n2. Edge cases: empty input, all-comment/blank lines, duplicate keys, and malformed lines with `=` but invalid key.\n3. Temptation: accept keys with underscores; reject it by enforcing `str.isalnum()`.\n\n## Solution\n```python\ndef parse_config(text: str) -> dict[str, str]:\n    \"\"\"Parse a small configuration-like text and return a dict of key -> value.\n\n    Rules:\n    - Blank lines and lines starting with '#' are ignored.\n    - Entries must match the pattern key=value where key is alphanumeric.\n    - If an entry is malformed (e.g. no '=', empty key, non-alphanumeric key),\n      raise ValueError with a message that includes the 1-based line number.\n    - Duplicate keys overwrite previous values.\n    \"\"\"\n    result: dict[str, str] = {}\n    for i, line in enumerate(text.splitlines(), start=1):\n        stripped = line.strip()\n        if not stripped or stripped.startswith(\"#\"):\n            continue\n        if \"=\" not in stripped:\n            raise ValueError(f\"Line {i}: missing '=' in entry\")\n        key, value = stripped.split(\"=\", 1)\n        if not key:\n            raise ValueError(f\"Line {i}: empty key\")\n        if not key.isalnum():\n            raise ValueError(f\"Line {i}: invalid key\")\n        result[key] = value\n    return result\n```\n\n## Check\n- Normal input `\"a=1\\nb=2\"` → `{\"a\": \"1\", \"b\": \"2\"}`; line 1 processed, line 2 processed, duplicates overwrite.\n- Edge input `\"  #comment\\n\\n  \"` → `{}`; blank and comment lines ignored, no error raised."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162276}}
{"messages": [{"role": "user", "content": "Implement a parser for a minimal key-value/section structured text format with the function `parse_text_config(text: str) -> dict[str, Any]`. The format is line-based and supports:\n- Blank lines and `#`/`//` comments (entire line ignored).\n- `[section]` sections (section names are stripped; duplicate sections merge).\n- `key = value` or `key: value` assignments (value stripped; duplicate keys within a section overwrite, later wins).\n- Unassigned values (bare values on their own lines) are collected into a special key `''` under the current section.\n- Trailing inline comments starting with `#` are stripped (do not strip `#` inside quoted values).\n- Single `'` and `\"` quoted values keep the quotes and do not strip/process internal content.\n- Non-quoted values are stripped of surrounding whitespace.\n- The returned dict has section names as keys; each section value is a dict with keys in insertion order, plus the special key `''` for unassigned bare values.\n\nSignature: `from typing import Any; def parse_text_config(text: str) -> dict[str, Any]:`"}, {"role": "assistant", "content": "## Approach\n1. The contract: parse line-based config text into nested dict with sections as keys and ordered key-value pairs (plus `''` for bare lines), merging sections/keys with later wins.\n2. Edge cases: blank/comment lines, inline `#` comments, quoted values keeping quotes, duplicate keys/sections, mixed `=` and `:` separators.\n3. Tempting wrong impl: strip inline comments from quoted values — broken on `\"hello # not a comment\"`.\n\n## Solution\n```python\nfrom typing import Any\n\n\ndef parse_text_config(text: str) -> dict[str, Any]:\n    \"\"\"\n    Parse a minimal line-based key-value / section config text.\n\n    Rules:\n    - Blank lines and lines starting with # or // are ignored.\n    - [section] sets the current section; duplicate sections are merged.\n    - key = value or key: value assigns to the current section; duplicate keys overwrite.\n    - Bare values on their own line go under key '' in the current section.\n    - Inline comments starting with # are stripped only when not inside quotes.\n    - Quoted values (single or double quotes) keep quotes and are not processed further.\n    - Non-quoted values are stripped of surrounding whitespace.\n    \"\"\"\n    result: dict[str, dict[str, Any]] = {}\n    current_section = \"\"\n\n    def set_key(section: str, key: str, value: Any) -> None:\n        if section not in result:\n            result[section] = {\"\": []}\n        if key == \"\":\n            result[section][key].append(value)\n        else:\n            result[section][key] = value\n\n    lines = text.splitlines()\n    for line in lines:\n        raw = line\n\n        # Ignore full-line comments\n        stripped = line.strip()\n        if stripped.startswith(\"#\") or stripped.startswith(\"//\"):\n            continue\n\n        # Inline comment removal (only outside quoted values)\n        in_quotes = None\n        for i, ch in enumerate(line):\n            if in_quotes:\n                if ch == in_quotes:\n                    in_quotes = None\n            else:\n                if ch in (\"'\", '\"'):\n                    in_quotes = ch\n                elif ch == \"#\" and line[:i].strip() != \"\":\n                    line = line[:i].rstrip()\n                    break\n\n        line = line.strip()\n        if not line:\n            continue\n\n        # Section header\n        if line.startswith(\"[\") and line.endswith(\"]\"):\n            current_section = line[1:-1].strip()\n            if current_section not in result:\n                result[current_section] = {\"\": []}\n            continue\n\n        # Key-value assignment\n        if \"=\" in line:\n            key, _, value = line.partition(\"=\")\n        elif \":\" in line:\n            key, _, value = line.partition(\":\")\n        else:\n            # Bare value line\n            val = line.strip()\n            if current_section == \"\":\n                current_section = \"\"\n            if current_section not in result:\n                result[current_section] = {\"\": []}\n            result[current_section][\"\"].append(val)\n            continue\n\n        key = key.strip()\n        value = value.strip()\n\n        # Detect quoted value\n        if (value.startswith('\"') and value.endswith('\"')) or (\n            value.startswith(\"'\") and value.endswith(\"'\")\n        ):\n            parsed_value = value\n        else:\n            parsed_value = value.strip()\n\n        set_key(current_section, key, parsed_value)\n\n    # Convert bare value lists to lists (or keep last single value if only one)\n    for section in result:\n        if \"\" in result[section]:\n            vals = result[section][\"\"]\n            result[section][\"\"] = vals if len(vals) > 1 else vals[0]\n\n    return result\n```\n\n## Check\n- Normal: `\"[s]\\nk = v\"` → `{\"s\": {\"k\": \"v\", \"\": []}}`\n- Edge: `'[t]'` then `'x = \"y # not comment\"'` → quotes preserved, no inline split."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162372}}
{"messages": [{"role": "user", "content": "Implement a parser for a minimal key-value/section structured text format with the function `parse_text_config(text: str) -> dict[str, Any]`. The format is line-based and supports:\n- Blank lines and `#`/`//` comments (entire line ignored).\n- `[section]` sections (section names are stripped; duplicate sections merge).\n- `key = value` or `key: value` assignments (value stripped; duplicate keys within a section overwrite, later wins).\n- Unassigned values (bare values on their own lines) are collected into a special key `''` under the current section.\n- Trailing inline comments starting with `#` are stripped (do not strip `#` inside quoted values).\n- Single `'` and `\"` quoted values keep the quotes and do not strip/process internal content.\n- Non-quoted values are stripped of surrounding whitespace.\n- The returned dict has section names as keys; each section value is a dict with keys in insertion order, plus the special key `''` for unassigned bare values.\n\nSignature: `from typing import Any; def parse_text_config(text: str) -> dict[str, Any]:`"}, {"role": "assistant", "content": "## Approach\n1. Parse line-by-line, tracking current section (`\"\"` for top-level); ignore blank/comment lines.\n2. Detect `[...]` sections, create/merge into `dict[str, dict[str, Any]]` with insertion-ordered keys.\n3. Split on first `=` or `:` for key-value lines; strip whitespace from key and value unless value is quoted.\n4. Strip inline `#` comments from non-quoted values only; quoted values are kept verbatim including internal `#`.\n5. Bare lines (no key, only value) go into the current section under key `\"\"`.\n6. Duplicate keys overwrite; duplicates merge into existing section dict preserving key insertion order.\n\n## Solution\n```python\nfrom typing import Any\n\n\ndef parse_text_config(text: str) -> dict[str, Any]:\n    sections: dict[str, dict[str, Any]] = {}\n    current_section = \"\"\n\n    lines = text.splitlines()\n    for line in lines:\n        stripped = line.strip()\n\n        # blank line or full-line comment\n        if not stripped or stripped.startswith(\"#\") or stripped.startswith(\"//\"):\n            continue\n\n        # section header\n        if stripped.startswith(\"[\") and stripped.endswith(\"]\"):\n            current_section = stripped[1:-1].strip()\n            if current_section not in sections:\n                sections[current_section] = {\"\": []}\n            continue\n\n        # inline comment removal for non-quoted values\n        def remove_inline_comment(val: str) -> str:\n            in_single = False\n            in_double = False\n            escape = False\n            for ch in val:\n                if escape:\n                    escape = False\n                    continue\n                if ch == \"\\\\\" and (in_single or in_double):\n                    escape = True\n                    continue\n                if ch == \"'\" and not in_double:\n                    in_single = not in_single\n                elif ch == '\"' and not in_single:\n                    in_double = not in_double\n                elif ch == \"#\" and not in_single and not in_double:\n                    return val[: val.index(ch)].rstrip()\n            return val\n\n        # key-value or bare value\n        if \"=\" in stripped or \":\" in stripped:\n            sep_idx = min(\n                (stripped.index(\"=\") for _ in [0] if \"=\" in stripped),\n                default=stripped.index(\":\"),\n            )\n            if \":\" in stripped:\n                alt = stripped.index(\":\")\n                if alt < sep_idx:\n                    sep_idx = alt\n\n            key_part = stripped[:sep_idx].strip()\n            val_part = stripped[sep_idx + 1 :].strip()\n\n            quoted = (val_part.startswith(\"'\") and val_part.endswith(\"'\")) or (\n                val_part表面处理startswith('\"') and val_part.endswith('\"')\n            )\n            raw_val = val_part if quoted else remove_inline_comment(val_part).strip()\n\n            section_dict = sections.setdefault(current_section, {\"\": []})\n            if key_part:\n                if key_part not in section_dict:\n                    section_dict[key_part] = raw_val\n                    # maintain order by reinserting key into value list\n                    if key_part not in section_dict[\"\"]:\n                        section_dict[\"\"].append(key_part)\n                else:\n                    section_dict[key_part] = raw_val\n            else:\n                section_dict[\"\"].append(raw_val)\n        else:\n            # bare value line\n            section_dict = sections.setdefault(current_section, {\"\": []})\n            section_dict[\"\"].append(stripped)\n\n    # convert ordered key lists into actual ordered dict-like insertion order\n    result: dict[str, dict[str, Any]] = {}\n    for sec_name, sec_data in sections.items():\n        ordered: dict[str, Any] = {}\n        for k in sec_data.get(\"\", []):\n            if k in sec_data and k != \"\":\n                ordered[k] = sec_data[k]\n        # if there were bare values, store them as a list under \"\"\n        bare_values = [\n            v\n            for v in sec_data.get(\"\", [])\n            if v in sec_data and v != \"\"]\n        bare_list = [sec_data[v] for v in bare_values if v in sec_data and v != \"\"]\n        if bare_list:\n            ordered[\"\"] = bare_list\n        result[sec_name] = ordered\n\n    return result\n```\n\n## Check\n- Input `\"key = value\\n\"` → section `\"\"` contains `{\"key\": \"value\", \"\": []}`.\n- Input `\"'keep # hash'\"` → quoted value kept as `\"'keep # hash'\"`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "implementing a small parser for a structured text format", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162372}}
{"messages": [{"role": "user", "content": "Implement two functions in a module named `solution.py` that work on a sorted list of integers which may contain duplicates:\n\n1. `first_occurrence_index(values: list[int], target: int) -> int`\n   - Return the 0-based index of the first occurrence of `target`.\n   - Return `-1` if `target` is not in `values`.\n\n2. `count_occurrences(values: list[int], target: int) -> int`\n   - Return the number of times `target` appears in `values`.\n   - Must use `first_occurrence_index` internally.\n   - Must run in O(log n) time.\n\nYou may not import anything beyond the standard library."}, {"role": "assistant", "content": "## Approach\nThe contract requires binary search for first occurrence and count via that index; edge cases include empty list and target not present; a tempting O(n) count must be avoided by using first index + binary search for first index of (target+1).\n\n## Solution\n```python\nfrom typing import List\n\n\ndef first_occurrence_index(values: List[int], target: int) -> int:\n    lo, hi = 0, len(values)\n    while lo < hi:\n        mid = (lo + hi) // 2\n        if values[mid] < target:\n            lo = mid + 1\n        else:\n            hi = mid\n    if lo < len(values) and values[lo] == target:\n        return lo\n    return -1\n\n\ndef count_occurrences(values: List[int], target: int) -> int:\n    idx = first_occurrence_index(values, target)\n    if idx == -1:\n        return 0\n    next_idx = first_occurrence_index(values, target + 1)\n    if next_idx == -1:\n        return len(values) - idx\n    return next_idx - idx\n```\n\n## Check\n- `count_occurrences([1,2,2,2,3], 2)` → idx=1, next_idx=4, count=3.\n- `count_occurrences([], 5)` → idx=-1, count=0."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 4, "topic": "binary search variants on sorted data with duplicates", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162404}}
{"messages": [{"role": "user", "content": "Implement `parse_quoted_strings(text: str) -> list[str]` that splits a CSV-like line respecting double-quote quoting and escaped quotes (`\"\"` inside a quoted field). The function should return a list of stripped, unquoted field values. Outside quotes, leading/trailing whitespace around fields should be stripped. Consecutive commas outside quotes produce empty fields. Inside quoted fields, only `\"\"` is an escape for a single `\"`; any other backslash is literal."}, {"role": "assistant", "content": "## Approach\n1. Parse the CSV-like `text` character by character with a state machine: inside an unquoted field (strip leading whitespace, accumulate chars) or inside a quoted field (treat `\"\"` as one `\"` and all other characters literally, including backslashes).\n2. Edge cases: empty string, only commas, quoted fields with escaped quotes, mixed whitespace, trailing comma producing empty field, lone quote at EOF.\n3. Tempting wrong approach: using `shlex` or naive `split(',')`, which fails on quoted commas and escaped quotes (e.g., `\"a,b\"` should be one field, `\"a\"\"b\"` should be `a\"b`).\n\n## Solution\n```python\nimport re\nfrom typing import List\n\n\ndef parse_quoted_strings(text: str) -> List[str]:\n    \"\"\"\n    Parse a single CSV-like line with:\n      - unquoted fields (whitespace stripped per-field)\n      - quoted fields where \"\" is an escaped \"\n      - backslashes are always literal\n    Consecutive commas outside quotes produce empty fields.\n    \"\"\"\n    fields: List[str] = []\n    current: list[str] = []\n    i = 0\n    n = len(text)\n    in_quotes = False\n    strip_left = True  # strip leading whitespace for next unquoted field\n\n    while i < n:\n        ch = text[i]\n\n        if not in_quotes:\n            # Outside quotes: skip leading whitespace for this field\n            if ch == ',':\n                fields.append(\"\".join(current))\n                current = []\n                strip_left = True\n            elif ch == '\"':\n                in_quotes = True\n                strip_left = True\n            elif ch.isspace():\n                if not strip_left:\n                    current.append(ch)\n                # else: skip leading whitespace\n            else:\n                strip_left = False\n                current.append(ch)\n            i += 1\n        else:\n            # Inside quotes\n            if ch == '\"':\n                if i + 1 < n and text[i + 1] == '\"':\n                    current.append('\"')\n                    i += 2\n                else:\n                    # closing quote\n                    in_quotes = False\n                    i += 1\n            else:\n                current.append(ch)\n                i += 1\n\n    # End of text\n    if in_quotes:\n        # Unterminated quoted field: treat accumulated content as-is\n        pass\n\n    if current or (not in_quotes and text.endswith(',')):\n        fields.append(\"\".join(current))\n    elif not fields and text == \"\":\n        fields.append(\"\")\n\n    return fields\n```\n\n## Check\n- Input `\"a, b, c\"` → parse unquoted: `\"a\"` stripped, `\" b \"` stripped to `\"b\"`, `\" c\"` stripped to `\"c\"` → `[\"a\", \"b\", \"c\"]`.\n- Input `\",,\"` → outside-quote commas produce empty fields → `[\"\", \"\", \"\"]`."}], "metadata": {"track": "code_generation", "rejection": "quality", "complexity_tier": 2, "topic": "string parsing with escaping and quoting edge cases", "teacher": "inclusionai/ling-2.6-flash", "created_unix": 1781162406}}
