Commit

Merge pull request #12 from KrisKennaway/fix-encoder
Modernize and fix some image quality bugs
KrisKennaway authored Jan 28, 2023
2 parents 4f6d7a7 + a925a89 commit ee66fa0
Showing 12 changed files with 267 additions and 173 deletions.
32 changes: 20 additions & 12 deletions README.md
@@ -1,14 +1,14 @@
# ]\[-Vision v0.2
# ]\[-Vision v0.3

Streaming video and audio for the Apple II.

]\[-Vision transcodes video files in standard formats (.mp4 etc) into a custom format optimized for streaming playback on
Apple II hardware.

Requires:
- 64K 6502 Apple II machine (only tested on //gs so far, but should work on older systems)
- 64K 6502 Apple II machine (tested on //gs and //e but should also work on ]\[/]\[+)
- [Uthernet II](http://a2retrosystems.com/products.htm) ethernet card
- AFAIK no emulators support this hardware so you'll need to run it on a real machine to see it in action
- AppleWin ([Windows](https://github.com/AppleWin/AppleWin) and [Linux](https://github.com/audetto/AppleWin)) and [Ample](https://github.com/ksherlock/ample) (Mac) emulate the Uthernet II. ]\[-Vision has been confirmed to work with Ample.

Dedicated to the memory of [Bob Bishop](https://www.kansasfest.org/2014/11/remembering-bob-bishop/), early pioneer of Apple II
[video](https://www.youtube.com/watch?v=RiWE-aO-cyU) and [audio](http://www.faddensoftware.com/appletalker.png).
@@ -17,6 +17,8 @@ Dedicated to the memory of [Bob Bishop](https://www.kansasfest.org/2014/11/remem

Sample videos (recording of playback on Apple //gs with RGB monitor, or HDMI via VidHD)

TODO: These are from older versions, for which quality was not as good.

Double Hi-Res:
- [Try getting this song out of your head](https://youtu.be/S7aNcyojoZI)
- [Babylon 5 title credits](https://youtu.be/PadKk8n1xY8)
@@ -28,8 +30,6 @@ Older Hi-Res videos:
- [Paranoimia ft Max Headroom](https://youtu.be/wfdbEyP6v4o)
- [How many of us still feel about our Apple II's](https://youtu.be/-e5LRcnQF-A)

(These are from older versions, for which quality was not as good)

There may be more on this [YouTube playlist](https://www.youtube.com/playlist?list=PLoAt3SC_duBiIjqK8FBoDG_31nUPB8KBM)

## Details
@@ -40,7 +40,7 @@ This ends up streaming data at about 100KB/sec of which 56KB/sec are updates to

The video frames are actually encoded at the original frame rate (or optionally by skipping frames), prioritizing differences in the screen content, so the effective frame rate is higher than this if only a fraction of the screen is changing between frames (which is the typical case).

I'm using the excellent (though under-documented ;) [BMP2DHR](http://www.appleoldies.ca/bmp2dhr/) to encode the input video stream into a sequence of memory maps, then post-processing the frame deltas to prioritize the screen bytes to stream in order to approximate these deltas as closely as possible within the timing budget.
I'm using the excellent (though under-documented ;) [BMP2DHR](https://github.com/digarok/b2d) to encode the input video stream into a sequence of memory maps, then post-processing the frame deltas to prioritize the screen bytes to stream in order to approximate these deltas as closely as possible within the timing budget.

### KansasFest 2019 presentation

@@ -50,27 +50,35 @@ TODO: link video once it is available.

## Installation

This currently requires python3.7 because some dependencies (e.g. weighted-levenshtein) don't compile with 3.9+, and 3.8
has a [bug](https://bugs.python.org/issue44439) in object pickling.
This currently requires python3.8 because some dependencies (e.g. weighted-levenshtein) don't compile with 3.9+.

```
python3.7 -m venv venv
python3.8 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

To generate the data files required by the transcoder:
Before you can run the transcoder you need to generate the data files it requires:

```
% python transcoder/make_data_tables.py
```

This takes about 3 hours on my machine.
This is a one-time setup. It takes about 90 minutes on my machine.

## Sample videos

TODO: download instructions
Some sample videos are available [here](https://www.dropbox.com/sh/kq2ej63smrzruwk/AADZSaqbNuTwAfnPWT6r9TJra?dl=0) for
streaming (see `server/server.py`)

## Release Notes

### v0.3 (17 Jan 2023)

- Fixed an image quality bug in the transcoder
- Documentation/quality of life improvements to installation process
- Stop using LFS to store the generated data files in git, they're using up all my quota

### v0.2 (19 July 2019)

#### Transcoder
26 changes: 22 additions & 4 deletions requirements.txt
@@ -1,12 +1,30 @@
appdirs==1.4.4
audioread==3.0.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.0.1
colormath==3.0.0
decorator==5.1.1
etaprogress==1.1.1
idna==3.4
importlib-metadata==6.0.0
joblib==1.2.0
librosa==0.9.2
networkx==2.6.3
numpy==1.21.6
llvmlite==0.39.1
networkx==3.0
numba==0.56.4
numpy==1.22.4 # Until colormath supports 1.23+
packaging==23.0
Pillow==9.4.0
scikit-learn==1.0.2
pooch==1.6.0
pycparser==2.21
requests==2.28.2
resampy==0.4.2
scikit-learn==1.2.0
scikit-video==1.1.11
scipy==1.7.3
scipy==1.10.0
soundfile==0.11.0
threadpoolctl==3.1.0
urllib3==1.26.14
weighted-levenshtein==0.2.1
zipp==3.11.0
10 changes: 5 additions & 5 deletions transcoder/audio.py
@@ -55,17 +55,17 @@ def _decode(self, f, buf) -> np.array:
'float32').reshape((f.channels, -1), order='F')

a = librosa.core.to_mono(data)
a = librosa.resample(a, f.samplerate,
self.sample_rate,
a = librosa.resample(a, orig_sr=f.samplerate,
target_sr=self.sample_rate,
res_type='scipy', scale=True).flatten()

return a

def _normalization(self, read_bytes=1024 * 1024 * 10):
"""Read first read_bytes of audio stream and compute normalization.
We compute the 2.5th and 97.5th percentiles i.e. only 2.5% of samples
will clip.
We normalize based on the 0.5th and 99.5th percentiles, i.e. only <1% of
samples will clip.
:param read_bytes:
:return:
@@ -77,7 +77,7 @@ def _normalization(self, read_bytes=1024 * 1024 * 10):
if len(raw) > read_bytes:
break
a = self._decode(f, raw)
norm = np.max(np.abs(np.percentile(a, [2.5, 97.5])))
norm = np.max(np.abs(np.percentile(a, [0.5, 99.5])))

return 16384. / norm
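
The percentile-based normalization in `_normalization` can be tried in isolation. The following is a minimal sketch, assuming only NumPy; `normalization_scale` and the synthetic `samples` array are hypothetical names for illustration and are not part of the transcoder.

```python
import numpy as np


def normalization_scale(samples, lo=0.5, hi=99.5, target=16384.0):
    """Return a scale factor that maps the larger of the two percentile
    magnitudes onto `target`; samples beyond that magnitude would clip."""
    lo_val, hi_val = np.percentile(samples, [lo, hi])
    norm = max(abs(lo_val), abs(hi_val))
    return target / norm


# Synthetic audio-like data (an assumption; real input comes from the decoder).
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 0.1, size=100_000).astype(np.float32)

scale = normalization_scale(samples)
# Fraction of samples whose scaled magnitude exceeds the target.
clipped = np.mean(np.abs(samples * scale) > 16384.0)
```

With the 0.5th and 99.5th percentiles, the clipped fraction should stay at or below roughly 1% of samples, compared with up to about 5% for the previous 2.5th and 97.5th percentiles.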

1 change: 0 additions & 1 deletion transcoder/data/.gitattributes

This file was deleted.

3 changes: 0 additions & 3 deletions transcoder/data/DHGR_palette_0_edit_distance.pickle.bz2

This file was deleted.

3 changes: 0 additions & 3 deletions transcoder/data/DHGR_palette_5_edit_distance.pickle.bz2

This file was deleted.

3 changes: 0 additions & 3 deletions transcoder/data/HGR_palette_0_edit_distance.pickle.bz2

This file was deleted.

3 changes: 0 additions & 3 deletions transcoder/data/HGR_palette_5_edit_distance.pickle.bz2

This file was deleted.

72 changes: 38 additions & 34 deletions transcoder/make_data_tables.py
@@ -1,6 +1,5 @@
import bz2
import functools
import pickle
import os
import sys
from typing import Iterable, Type

@@ -17,7 +16,7 @@


PIXEL_CHARS = "0123456789ABCDEF"

DATA_DIR = "transcoder/data"

def pixel_char(i: int) -> str:
return PIXEL_CHARS[i]
@@ -39,7 +38,7 @@ class EditDistanceParams:
# Smallest substitution value is ~20 from palette.diff_matrices, i.e.
# we always prefer to transpose 2 pixels rather than substituting colours.
# TODO: is quality really better allowing transposes?
transpose_costs = np.ones((128, 128), dtype=np.float64) * 100000 # 10
transpose_costs = np.ones((128, 128), dtype=np.float64)

# These will be filled in later
substitute_costs = np.zeros((128, 128), dtype=np.float64)
@@ -113,7 +112,7 @@ def compute_edit_distance(
edp: EditDistanceParams,
bitmap_cls: Type[screen.Bitmap],
nominal_colours: Type[colours.NominalColours]
):
) -> np.ndarray:
"""Computes edit distance matrix between all pairs of pixel strings.
Enumerates all possible values of the masked bit representation from
@@ -131,44 +130,45 @@ def compute_edit_distance(

bitrange = np.uint64(2 ** bits)

edit = []
for _ in range(len(bitmap_cls.BYTE_MASKS)):
edit.append(
np.zeros(shape=np.uint64(bitrange * bitrange), dtype=np.uint16))
edit = np.zeros(
shape=(len(bitmap_cls.BYTE_MASKS), np.uint64(bitrange * bitrange)),
dtype=np.uint16)

# Matrix is symmetrical with zero diagonal so only need to compute upper
# triangle
bar = ProgressBar((bitrange * (bitrange - 1)) / 2, max_width=80)
bar = ProgressBar(
bitrange * (bitrange - 1) / 2 * len(bitmap_cls.PHASES), max_width=80)

num_dots = bitmap_cls.MASKED_DOTS

cnt = 0
for i in range(np.uint64(bitrange)):
for j in range(i):
cnt += 1

if cnt % 10000 == 0:
bar.numerator = cnt
print(bar, end='\r')
sys.stdout.flush()
pair_base = np.uint64(i) << bits
for o, ph in enumerate(bitmap_cls.PHASES):
# Compute this in the outer loop since it's invariant under j
first_dots = bitmap_cls.to_dots(i, byte_offset=o)
first_pixels = pixel_string(
colours.dots_to_nominal_colour_pixel_values(
num_dots, first_dots, nominal_colours,
init_phase=ph)
)

# Matrix is symmetrical with zero diagonal so only need to compute
# upper triangle
for j in range(i):
cnt += 1
if cnt % 100000 == 0:
bar.numerator = cnt
print(bar, end='\r')
sys.stdout.flush()

pair = pair_base + np.uint64(j)

pair = (np.uint64(i) << bits) + np.uint64(j)

for o, ph in enumerate(bitmap_cls.PHASES):
first_dots = bitmap_cls.to_dots(i, byte_offset=o)
second_dots = bitmap_cls.to_dots(j, byte_offset=o)

first_pixels = pixel_string(
colours.dots_to_nominal_colour_pixel_values(
num_dots, first_dots, nominal_colours,
init_phase=ph)
)
second_pixels = pixel_string(
colours.dots_to_nominal_colour_pixel_values(
num_dots, second_dots, nominal_colours,
init_phase=ph)
)
edit[o][pair] = edit_distance(
edit[o, pair] = edit_distance(
edp, first_pixels, second_pixels, error=False)

return edit
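
The flat pair index and the progress-bar total used in `compute_edit_distance` can be checked with plain integers. A small sketch follows, with `pair_index`, `unpack_pair` and `num_unordered_pairs` as hypothetical helper names rather than functions from this repository:

```python
from typing import Tuple


def pair_index(i: int, j: int, bits: int) -> int:
    """Pack two `bits`-wide values into one flat index: (i << bits) + j."""
    return (i << bits) + j


def unpack_pair(pair: int, bits: int) -> Tuple[int, int]:
    """Inverse of pair_index: recover (i, j) from the flat index."""
    return pair >> bits, pair & ((1 << bits) - 1)


def num_unordered_pairs(bits: int) -> int:
    """Number of (i, j) pairs with j < i among 2**bits values."""
    bitrange = 1 << bits
    return bitrange * (bitrange - 1) // 2


# Count the pairs visited by the nested loops for a small example width.
bits = 4
visited = sum(1 for i in range(1 << bits) for j in range(i))
```

Because only pairs with `j < i` are computed, the loop visits `bitrange * (bitrange - 1) / 2` pairs per phase, which is why the new progress-bar total multiplies that count by `len(bitmap_cls.PHASES)`.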
@@ -183,13 +183,17 @@ def make_edit_distance(
"""Write file containing (D)HGR edit distance matrix for a palette."""

dist = compute_edit_distance(edp, bitmap_cls, nominal_colours)
data = "transcoder/data/%s_palette_%d_edit_distance.pickle.bz2" % (
bitmap_cls.NAME, pal.ID.value)
with bz2.open(data, "wb", compresslevel=9) as out:
pickle.dump(dist, out, protocol=pickle.HIGHEST_PROTOCOL)
data = "%s/%s_palette_%d_edit_distance.npz" % (
DATA_DIR, bitmap_cls.NAME, pal.ID.value)
np.savez_compressed(data, edit_distance=dist)
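
Writing the table with `np.savez_compressed` means it can be read back with `np.load` under the same keyword name. A minimal round-trip sketch, assuming a small stand-in array and a temporary file name chosen for illustration (neither is taken from the repository):

```python
import os
import tempfile

import numpy as np

# Small stand-in for the real edit distance matrix (assumed layout:
# one row per byte mask, one column per packed pair index).
dist = np.arange(2 * 16, dtype=np.uint16).reshape(2, 16)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_edit_distance.npz")
    # The keyword name becomes the key inside the .npz archive.
    np.savez_compressed(path, edit_distance=dist)
    # np.load returns an archive object; index it by the same key.
    with np.load(path) as data:
        loaded = data["edit_distance"]
```

Storing a single 2-D array also matches the change from a list of per-mask arrays to one `np.zeros` allocation indexed as `edit[o, pair]`.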


def main():
try:
os.mkdir(DATA_DIR, mode=0o755)
except FileExistsError:
pass

for p in palette.PALETTES.values():
print("Processing palette %s" % p)
edp = compute_substitute_costs(p)
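
The try/except around `os.mkdir` at the start of `main()` could also be written with the standard library's `os.makedirs(..., exist_ok=True)`. This is an alternative sketch, not what the commit does; note that `makedirs` also creates missing parent directories, which `os.mkdir` does not. `ensure_dir` is a hypothetical helper name.

```python
import os
import tempfile


def ensure_dir(path: str, mode: int = 0o755) -> None:
    """Create `path` if needed; do nothing if it already exists."""
    os.makedirs(path, mode=mode, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "transcoder", "data")
    ensure_dir(target)
    ensure_dir(target)  # second call is a no-op rather than an error
    created = os.path.isdir(target)
```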
41 changes: 31 additions & 10 deletions transcoder/movie.py
@@ -6,6 +6,7 @@
import frame_grabber
import machine
import opcodes
import screen
import video
from palette import Palette
from video_mode import VideoMode
@@ -58,34 +59,54 @@ def encode(self) -> Iterator[opcodes.Opcode]:
:return:
"""
video_frames = self.frame_grabber.frames()
main_seq = None
aux_seq = None
op_seq = None

yield opcodes.Header(mode=self.video_mode)

last_memory_bank = self.aux_memory_bank
for au in self.audio.audio_stream():
self.ticks += 1
if self.video.tick(self.ticks):
new_video_frame = self.video.tick(self.ticks)
if new_video_frame:
try:
main, aux = next(video_frames)
except StopIteration:
break

if ((self.video.frame_number - 1) % self.every_n_video_frames
== 0):
should_encode_frame = (
(self.video.frame_number - 1) %
self.every_n_video_frames == 0
)
if should_encode_frame:
if self.video_mode == VideoMode.DHGR:
target_pixelmap = screen.DHGRBitmap(
main_memory=main,
aux_memory=aux,
palette=self.palette
)
else:
target_pixelmap = screen.HGRBitmap(
main_memory=main,
palette=self.palette
)

print("Starting frame %d" % self.video.frame_number)
main_seq = self.video.encode_frame(main, is_aux=False)
op_seq = self.video.encode_frame(
target_pixelmap, is_aux=self.aux_memory_bank)
self.video.out_of_work = {True: False, False: False}

if aux:
aux_seq = self.video.encode_frame(aux, is_aux=True)
if self.aux_memory_bank != last_memory_bank:
# We've flipped memory banks, start new opcode sequence
last_memory_bank = self.aux_memory_bank
op_seq = self.video.encode_frame(
target_pixelmap, is_aux=self.aux_memory_bank)

# au has range -15 .. 16 (step=1)
# Tick cycles are units of 2
tick = au * 2 # -30 .. 32 (step=2)
tick += 34 # 4 .. 66 (step=2)

(page, content, offsets) = next(
aux_seq if self.aux_memory_bank else main_seq)
(page, content, offsets) = next(op_seq)

yield opcodes.TICK_OPCODES[(tick, page)](content, offsets)
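
The audio-to-tick arithmetic in the comments above can be verified directly. A short sketch, where `audio_to_tick` is a hypothetical helper name and the input range of -15 to 16 is taken from the code comment:

```python
def audio_to_tick(au: int) -> int:
    """Map an audio sample in -15..16 to a tick count in 4..66 (step 2)."""
    if not -15 <= au <= 16:
        raise ValueError("au out of range: %d" % au)
    return au * 2 + 34


# Enumerate every valid audio value and its tick count.
ticks = [audio_to_tick(au) for au in range(-15, 17)]
```

This yields 32 distinct even values, from 4 for `au = -15` up to 66 for `au = 16`.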
