Interactive protein space visualization web components for bioinformatics research.
Note: This is a reimplementation of the ProtSpace tool described in the paper "ProtSpace: A Tool for Visualizing Protein Space" (Journal of Molecular Biology, 2025).
ProtSpace provides interactive visualization of protein embeddings from protein language models (pLMs) through web components. The tool enables researchers to explore protein relationships in 2D/3D space using embeddings from models like ProtT5, with support for:
- Canvas-based scatter plots with D3.js for high-performance rendering
- Interactive legends with filtering and selection
- Apache Arrow data loading for efficient data handling
- Customizable styling and theming
- Framework-agnostic web components that work with any JavaScript framework
This monorepo uses Turborepo for efficient build caching and parallel execution:
protspace/
├── packages/
│   ├── core/                        # @protspace/core - Web components
│   │   ├── src/
│   │   │   ├── components/
│   │   │   │   ├── scatterplot/     # Main scatter plot component
│   │   │   │   ├── legend/          # Interactive legend component
│   │   │   │   └── data-loader/     # Arrow data loading component
│   │   │   ├── styles/              # CSS and theming
│   │   │   ├── shared/              # Shared utilities and types
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   ├── utils/                       # @protspace/utils - Shared utilities
│   │   ├── src/
│   │   │   ├── arrow/               # Apache Arrow parsing
│   │   │   ├── math/                # Clustering, dimensionality reduction
│   │   │   ├── visualization/       # Color schemes, scales
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   └── react-bridge/                # @protspace/react - React wrappers
│       ├── src/
│       │   ├── ScatterplotWrapper.tsx
│       │   ├── LegendWrapper.tsx
│       │   ├── hooks/               # React hooks
│       │   └── index.ts
│       └── package.json
│
├── apps/
│   └── src/
│       ├── app/                     # Next.js 14 app router
│       ├── components/              # Existing components (migration) and app components
│       └── package.json
│
├── examples/                        # Standalone examples
│   └── scatterplot-vite/            # Scatterplot example using Vite
│       ├── index.html
│       ├── src/main.ts
│       ├── package.json
│       └── vite.config.ts
│
├── turbo.json                       # Turborepo configuration
├── package.json                     # Root package scripts
└── pnpm-workspace.yaml              # Workspace configuration
- Node.js 18+
- pnpm 9+ (recommended) or npm
- Git
# Clone the repository
git clone https://github.com/your-username/protspace.git
cd protspace
# Install dependencies
pnpm install
# Build all packages
pnpm build
# Start development
pnpm dev
The Next.js demo app will be available at http://localhost:3000.
@protspace/core is the main web components package, built with Lit:
- <protspace-scatterplot> - Interactive scatter plot with D3.js canvas rendering
- <protspace-legend> - Customizable legend with filtering
- <protspace-data-loader> - Apache Arrow data loading and parsing
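As a rough illustration of what legend filtering involves, the sketch below counts points per category and then hides the points whose category has been toggled off. It is plain TypeScript and not the component's actual implementation; `LegendEntry`, `buildLegend`, `filterVisible`, and the `"(none)"` placeholder are hypothetical names introduced here.

```typescript
// Hypothetical types and helpers for illustration only; not the @protspace/core API.
interface Point {
  x: number;
  y: number;
  category?: string;
}

interface LegendEntry {
  category: string;
  count: number;
}

// Placeholder bucket for points that carry no category.
const UNCATEGORIZED = "(none)";

// Count how many points fall into each category, largest group first.
function buildLegend(points: Point[]): LegendEntry[] {
  const counts = new Map<string, number>();
  for (const p of points) {
    const key = p.category ?? UNCATEGORIZED;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts, ([category, count]) => ({ category, count }))
    .sort((a, b) => b.count - a.count || a.category.localeCompare(b.category));
}

// Keep only the points whose category has not been hidden via the legend.
function filterVisible(points: Point[], hidden: Set<string>): Point[] {
  return points.filter((p) => !hidden.has(p.category ?? UNCATEGORIZED));
}

// Made-up sample points.
const demoPoints: Point[] = [
  { x: 0.1, y: 0.2, category: "kinase" },
  { x: 0.4, y: 0.9, category: "kinase" },
  { x: 0.7, y: 0.3, category: "protease" },
  { x: 0.5, y: 0.5 },
];

const legendEntries = buildLegend(demoPoints);
const visiblePoints = filterVisible(demoPoints, new Set(["protease"]));
console.log(legendEntries, visiblePoints.length);
```

Keeping the hidden categories in a `Set` makes each visibility check constant-time, which matters when a plot holds many thousands of proteins.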
<!-- Vanilla JavaScript usage -->
<protspace-scatterplot
  data="path/to/data.arrow"
  theme="scientific">
</protspace-scatterplot>
@protspace/utils provides shared utilities for data processing and visualization:
- Arrow parsing - Efficient loading of protein data
- Math utilities - Clustering, PCA, UMAP implementations
- Visualization helpers - Color schemes, scales, layouts
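To make the "scales" and "color schemes" helpers more concrete, here is a minimal sketch of a linear scale that maps embedding coordinates to canvas pixels and a palette assignment for categories. The function names (`linearScale`, `extent`, `categoryColors`) and the 800-pixel canvas width are assumptions for illustration; they are not taken from @protspace/utils.

```typescript
// Hypothetical helpers for illustration only; not the @protspace/utils API.
type Scale = (value: number) => number;

// Map values from the domain [d0, d1] linearly onto the range [r0, r1].
function linearScale([d0, d1]: [number, number], [r0, r1]: [number, number]): Scale {
  const span = d1 - d0;
  // A zero-width domain maps every value to the middle of the range.
  if (span === 0) return () => (r0 + r1) / 2;
  return (value) => r0 + ((value - d0) / span) * (r1 - r0);
}

// Smallest and largest value of a non-empty list.
function extent(values: number[]): [number, number] {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return [min, max];
}

// Give each distinct category a palette color, cycling when the palette runs out.
// Assumes the palette is non-empty.
function categoryColors(categories: string[], palette: string[]): Map<string, string> {
  const colors = new Map<string, string>();
  for (const c of categories) {
    if (!colors.has(c)) colors.set(c, palette[colors.size % palette.length]);
  }
  return colors;
}

// Made-up coordinates and categories.
const xs = [-2, 0, 6];
const toPixelX = linearScale(extent(xs), [0, 800]);
const familyColors = categoryColors(["kinase", "protease", "kinase"], ["#1f77b4", "#ff7f0e"]);
console.log(toPixelX(0), familyColors.get("protease"));
```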
@protspace/react provides React wrappers and hooks for seamless React integration:
import { ScatterplotWrapper, useProtspaceData } from '@protspace/react';

function MyComponent() {
  // proteinData is your already-loaded protein data (not defined in this snippet)
  const { data, selectPoint } = useProtspaceData(proteinData);
  return (
    <ScatterplotWrapper
      data={data}
      onPointSelect={selectPoint}
      theme="dark"
    />
  );
}
# Development
pnpm dev # Start all packages in dev mode
pnpm dev:app # Start only the Next.js app
pnpm dev:core # Start only core components
# Building
pnpm build # Build all packages
pnpm build:core # Build only core components
# Testing
pnpm test # Run all tests
pnpm test:watch # Run tests in watch mode
# Linting & Formatting
pnpm lint # Lint all packages
pnpm format # Format code with Prettier
pnpm type-check # TypeScript type checking
Turborepo allows you to work efficiently with specific packages:
# Work on core components only
turbo dev --filter=@protspace/core
# Build and test React bridge
turbo build test --filter=@protspace/react
# Work on Next.js app with its dependencies
turbo dev --filter=nextjs-app
# Work on everything except demo
turbo dev --filter=!demo
- Start development mode:
  pnpm dev
  This starts all packages in watch mode with hot reloading.
- Make changes to components in packages/core/src/components/
- See changes reflected immediately in the Next.js app at localhost:3000
- Test your changes:
  pnpm test
- Build for production:
  pnpm build
ProtSpace components use CSS Custom Properties for flexible theming:
/* Custom theme */
protspace-scatterplot {
  --protspace-bg-primary: #1a202c;
  --protspace-text-primary: #f7fafc;
  --protspace-point-size: 6px;
  --protspace-selection-color: #63b3ed;
}
Built-in themes:
- light - Clean light theme
- dark - Dark mode optimized
- scientific - Publication-ready styling
- colorblind - Accessibility optimized
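When theme overrides are kept in application code rather than a stylesheet, a CSS rule like the custom theme above can be generated from a plain object. The following is a minimal sketch under that assumption; `themeToCss` and `darkOverrides` are hypothetical names, and only the four custom properties already shown in this README are used.

```typescript
// Hypothetical helper for illustration only; not part of @protspace/core.
type ThemeVars = Record<string, string>;

// Render a selector and its custom properties as a CSS rule string.
function themeToCss(selector: string, vars: ThemeVars): string {
  const declarations = Object.entries(vars)
    .map(([name, value]) => `  ${name}: ${value};`)
    .join("\n");
  return `${selector} {\n${declarations}\n}`;
}

// The same values as the custom theme shown earlier in this section.
const darkOverrides: ThemeVars = {
  "--protspace-bg-primary": "#1a202c",
  "--protspace-text-primary": "#f7fafc",
  "--protspace-point-size": "6px",
  "--protspace-selection-color": "#63b3ed",
};

const css = themeToCss("protspace-scatterplot", darkOverrides);
console.log(css);
```

In a browser, the same name/value pairs could instead be applied to a single element with the standard `element.style.setProperty(name, value)` call.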
ProtSpace uses Apache Arrow for efficient data loading:
interface DataPoint {
  x: number;          // X coordinate
  y: number;          // Y coordinate
  category?: string;  // Protein family/category
  color?: string;     // Custom color
  size?: number;      // Point size
  label?: string;     // Display label
  metadata?: object;  // Additional data
}
Example Arrow file structure:
- x, y - Embedding coordinates (e.g., from UMAP/PCA)
- protein_id - UniProt ID or identifier
- family - Protein family classification
- organism - Taxonomic information
- function - Functional annotation
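The columns listed above can be mapped onto the DataPoint shape once the Arrow file has been decoded into row objects. The sketch below assumes that decoding has already happened and works on plain records; `ProteinRow` and `rowsToDataPoints` are hypothetical names, and the choice to map `family` to `category` and `protein_id` to `label` is an assumption, not documented behavior of the data loader.

```typescript
// The DataPoint shape from this README, repeated so the sketch is self-contained.
interface DataPoint {
  x: number;
  y: number;
  category?: string;
  color?: string;
  size?: number;
  label?: string;
  metadata?: object;
}

// Hypothetical row shape: one record per protein, using the column names listed above.
interface ProteinRow {
  x: number;
  y: number;
  protein_id: string;
  family?: string;
  organism?: string;
  function?: string;
}

// Convert decoded rows to DataPoint objects, skipping rows without finite coordinates.
// Assumed mapping: family -> category, protein_id -> label, the rest -> metadata.
function rowsToDataPoints(rows: ProteinRow[]): DataPoint[] {
  return rows
    .filter((r) => Number.isFinite(r.x) && Number.isFinite(r.y))
    .map((r) => ({
      x: r.x,
      y: r.y,
      label: r.protein_id,
      category: r.family,
      metadata: { organism: r.organism, function: r.function },
    }));
}

// Made-up sample rows; the identifiers are placeholders, not real UniProt entries.
const sampleRows: ProteinRow[] = [
  { x: 1.5, y: -0.3, protein_id: "ID_A", family: "kinase", organism: "example organism" },
  { x: NaN, y: 2.0, protein_id: "ID_B", family: "protease" },
  { x: -4.2, y: 0.8, protein_id: "ID_C" },
];

const dataPoints = rowsToDataPoints(sampleRows);
console.log(dataPoints);
```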
The project uses Vitest for testing:
# Run all tests
pnpm test
# Run tests in watch mode
pnpm test:watch
# Run tests with UI
pnpm test:ui
# Test specific package
turbo test --filter=@protspace/core
- Components: See docs/components/ for detailed component APIs
- Examples:
  - examples/scatterplot-vite/: Scatterplot demo using Vite. Run with pnpm dev:example:scatterplot-vite.
  - Check examples/ for other usage examples (as they are added).
- Migration Guide: docs/migration-guide.md for migrating from existing implementations
This monorepo uses Changesets for version management:
# Create a changeset
pnpm changeset
# Version packages
pnpm version-packages
# Publish to npm
pnpm release
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Add tests for new functionality
- Run tests: pnpm test
- Create a changeset: pnpm changeset
- Commit changes: git commit -m 'Add amazing feature' (Note on Commit Messages)
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
- Lit for web components framework
- D3.js for visualization capabilities
- Apache Arrow for efficient data handling
- Turborepo for monorepo tooling
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: GitHub Wiki
We use Storybook for developing and showcasing UI components in isolation. This is particularly useful for the web components in the @protspace/core package.
Storybook allows you to:
- View components in different states and with various props.
- Interact with components dynamically through controls.
- Get an overview of the available components and their usage.
To start Storybook for the @protspace/core components:
# Navigate to the core package if you aren't already there (optional)
# cd packages/core
# Run the storybook script from the root or within packages/core
pnpm --filter @protspace/core storybook
This will typically open Storybook in your browser at http://localhost:6006.
Stories are defined in *.stories.ts files alongside the components (e.g., packages/core/src/components/scatterplot/scatterplot.stories.ts). Each story represents a specific state or use case of a component.
This project uses Changesets to manage versioning, changelogs, and publishing of packages. Changesets help ensure that all changes are properly documented and that package versions are incremented correctly based on the semver specification.
When you make a change to a package that you intend to be part of a release, you should add a "changeset" file. This file captures your intent:
- Which packages are affected.
- Whether the change is a patch, minor, or major update for each affected package.
- A short description of the change, which will be used to compile the changelog.
- After making your code changes, run the following command:
  pnpm changeset add
- Changesets will then prompt you to:
  - Select which packages have been changed (use spacebar to select, enter to confirm).
  - Specify the semver bump type (patch, minor, major) for each selected package.
  - Write a summary of the changes. This summary will be included in the changelogs.
  This process will create a new markdown file in the .changeset directory at the root of the project.
- Commit this generated markdown file along with your code changes.
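For orientation, a generated changeset file is a small markdown document: front matter listing each affected package with its bump type, followed by the summary. The sketch below renders that layout from an object so the structure is easy to see. `renderChangeset` is a hypothetical helper, the summary text is made up, and in practice the file should be created by the pnpm changeset command rather than by hand.

```typescript
// Hypothetical helper for illustration only; the Changesets CLI writes these files itself.
type BumpType = "patch" | "minor" | "major";

// Render changeset file contents: front matter with one "package": bump line
// per affected package, a blank line, then the summary.
function renderChangeset(bumps: Record<string, BumpType>, summary: string): string {
  const frontMatter = Object.entries(bumps)
    .map(([pkg, bump]) => `"${pkg}": ${bump}`)
    .join("\n");
  return `---\n${frontMatter}\n---\n\n${summary.trim()}\n`;
}

const changesetText = renderChangeset(
  { "@protspace/core": "minor", "@protspace/react": "patch" },
  "Example summary of the change.",
);
console.log(changesetText);
```

Because the summary is copied into each affected package's changelog, it is worth writing it for readers of the release notes rather than for reviewers of the commit.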
When it's time to release:
- Versioning: The release process will consume all changeset files to determine the new versions for the packages.
  pnpm changeset version # or npx changeset version
  This command updates the package.json versions of the changed packages and updates their changelog files (e.g., CHANGELOG.md).
- Publishing: After versions are bumped and changelogs are updated, you can publish the packages.
  pnpm changeset publish # or npx changeset publish
  This will publish the packages that have been updated in the version step to the configured npm registry.
  (Typically, these version and publish steps are part of an automated CI/CD release pipeline after merging changes to a main branch.)
- Add changesets early and often: It's best to add a changeset as part of the same commit where the actual code changes are made.
- Be descriptive: Good changeset messages lead to good changelogs.