Google Summer of Code 2019

Ayush Bhardwaj edited this page Mar 10, 2020 · 13 revisions

Continuation of Atarashi OSS

Introduction

Atarashi scans open source software for license statements, focusing on text statistics. It is designed to work both stand-alone and with FOSSology. At present it runs only through a simple command-line interface, separately from FOSSology, so it needs a GUI to become more user-friendly. FOSSology already has a stable GUI and advanced features refined over the years. Atarashi is built on text statistics and information retrieval algorithms written purely in Python. Integrating Atarashi with FOSSology will make Atarashi more capable and let it reuse FOSSology's existing features through the FOSSology user interface. A standardized evaluator is also needed to validate the accuracy and reliability of existing and upcoming license scanning algorithms. Finally, a machine learning model and algorithm should be established as a baseline that can be improved in the future, with the aim of making Atarashi more accurate and faster.
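To make the text-statistics idea concrete, here is a minimal sketch of matching a license statement against reference texts by term-frequency cosine similarity. It is not Atarashi's actual implementation: the function names (`tokenize`, `cosine_similarity`, `best_match`) and the shortened `REFERENCES` texts are assumptions made only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no meaningful similarity
    return dot / (norm_a * norm_b)


def best_match(text, references):
    """Return (license_name, score) for the reference most similar to text."""
    query = Counter(tokenize(text))
    scores = {
        name: cosine_similarity(query, Counter(tokenize(ref)))
        for name, ref in references.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical, heavily shortened reference texts (illustration only).
REFERENCES = {
    "MIT": "Permission is hereby granted, free of charge, to any person "
           "obtaining a copy of this software",
    "GPL-2.0": "This program is free software; you can redistribute it "
               "and/or modify it under the terms of the GNU General Public License",
}
```

Calling `best_match("Permission is hereby granted, free of charge", REFERENCES)` picks the `"MIT"` entry, because the query shares far more terms with that reference. A real scanner would use full license texts and a more careful weighting scheme.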

Project Goals

  • Packaging Atarashi
  • Publishing to PyPI (Currently published to TestPyPI)
  • Integrating with FOSSology using the Ninka Wrapper
  • Creating a Working UI
  • Writing an Evaluation Script for Algorithms
  • Implementing a New Algorithm for Atarashi [Semantic Text Similarity]
  • Writing Tests & Fixing Bugs
  • Documentation

Weekly Progress Reports

Workflow

First Evaluation

  • Studied Atarashi Structure and Workflow
  • Learned to Publish a Python Package to PyPI
  • Created Atarashi Package
  • Published Atarashi Package to TestPyPI
  • Studying and Testing "Ninka" Wrapper (FOSSology)
  • Integrating Atarashi with FOSSology

Second Evaluation

  • Creating a working UI for Atarashi
  • Discussing and studying the new algorithm
  • Finding a method to evaluate the algorithms
  • Creating the Evaluation Script
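An evaluation script of the kind listed above could, in its simplest form, run a scanner over labeled samples and report accuracy and run time. The sketch below only illustrates that idea and is not the script written for the project: `evaluate`, `keyword_scanner`, and `SAMPLES` are hypothetical names, and the scanner is a trivial stand-in for a real license scanning algorithm.

```python
import time


def evaluate(scanner, labeled_samples):
    """Run scanner over (text, expected_license) pairs; report accuracy and time."""
    correct = 0
    start = time.perf_counter()
    for text, expected in labeled_samples:
        if scanner(text) == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    total = len(labeled_samples)
    return {
        "total": total,
        "correct": correct,
        "accuracy": correct / total if total else 0.0,
        "seconds": elapsed,
    }


def keyword_scanner(text):
    """Hypothetical stand-in scanner: matches on fixed key phrases only."""
    lowered = text.lower()
    if "gnu general public license" in lowered:
        return "GPL"
    if "permission is hereby granted" in lowered:
        return "MIT"
    return "Unknown"


# Hypothetical labeled samples (illustration only).
SAMPLES = [
    ("Permission is hereby granted, free of charge", "MIT"),
    ("under the terms of the GNU General Public License", "GPL"),
    ("Licensed under the Apache License, Version 2.0", "Apache-2.0"),
]

if __name__ == "__main__":
    report = evaluate(keyword_scanner, SAMPLES)
    print(f"{report['correct']}/{report['total']} correct, "
          f"accuracy {report['accuracy']:.2f}, {report['seconds']:.4f}s")
```

Because every algorithm is passed in as a plain callable, the same harness can compare existing and new scanners on an identical dataset.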

Third Evaluation

  • Finding Dataset for New Algorithm
  • Training the model
  • Implementing the new algorithm on the trained model
  • Resolving any issues with the new algorithm
  • Fixing bugs
  • Documentation

Successfully completed Google Summer of Code 2019 for FOSSology