18 Commits

SHA1 Message Date
511986a131 updating cli.py 2020-03-12 21:30:12 +00:00
8dc88f6361 updating poetry installation instructions 2020-03-09 12:25:45 +00:00
0a77fa34fd adding poetry to installation instructions 2020-03-09 12:22:35 +00:00
0034340d63 updating comments document 2020-03-09 12:12:11 +00:00
02cb79c4b2 Merge branch 'master' into develop 2020-03-09 11:57:22 +00:00
26b346d359 Merge branch 'documentation' 2020-03-09 11:56:19 +00:00
1dae95735f updating documentation 2020-03-09 11:56:03 +00:00
78544673b4 Merge branch 'develop' 2020-03-09 11:38:49 +00:00
e8ce4b59f8 Merge branch 'documentation' into develop 2020-03-09 11:38:36 +00:00
e09728a7c7 updating documentation 2020-03-09 11:38:22 +00:00
ae6c2bf985 Merge branch 'master' into documentation 2020-03-09 03:38:06 +00:00
fd144abff0 Merge branch 'develop' 2020-03-09 03:37:17 +00:00
d499ee175e removing badge 2020-03-09 03:37:11 +00:00
fadcb98d81 Merge branch 'develop' 2020-03-09 03:34:48 +00:00
aa98102d6a removing badge 2020-03-09 03:34:36 +00:00
a10426f043 Merge branch 'develop' 2020-03-09 03:31:09 +00:00
78ac63ca36 code quality improvements 2020-03-09 03:30:52 +00:00
2d8f8dc63f Merge branch 'master' into develop 2020-03-09 03:20:26 +00:00
11 changed files with 108 additions and 44 deletions

BIN
.DS_Store vendored

Binary file not shown.

@@ -10,7 +10,6 @@ Introduction
.. image:: https://img.shields.io/github/languages/code-size/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge .. image:: https://img.shields.io/github/languages/code-size/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge
.. image:: https://img.shields.io/github/languages/top/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge .. image:: https://img.shields.io/github/languages/top/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge
.. image:: https://img.shields.io/requires/github/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge .. image:: https://img.shields.io/requires/github/dtomlinson91/musicbrainzapi-cv-airelogic?style=for-the-badge
.. image:: https://img.shields.io/codacy/grade/f9517450400d48b0a7222a383c2e8fe2?style=for-the-badge
Summary Summary
======== ========
@@ -71,6 +70,27 @@ In the root of the repo in a virtual environment run:
python ./setup.py install python ./setup.py install
poetry
------
Clone the repo:
.. code-block:: bash
git clone https://github.com/dtomlinson91/musicbrainzapi-cv-airelogic.git
In a virtual environment install poetry:
.. code-block:: bash
pip install poetry
In the root of the repo in a virtual environment run:
.. code-block:: bash
poetry install --no-dev
Docker Docker
------ ------

@@ -53,7 +53,7 @@ Code restructure
The :class:`musicbrainzapi.api.lyrics.concrete_builder.LyricsConcreteBuilder` class could be improved. Many of the methods defined in here no longer need to be present. Some of the functionality (url checking for example) could be removed and implemented in other ways (a Mixin is one solution). The :class:`musicbrainzapi.api.lyrics.concrete_builder.LyricsConcreteBuilder` class could be improved. Many of the methods defined in here no longer need to be present. Some of the functionality (url checking for example) could be removed and implemented in other ways (a Mixin is one solution).
If other ways of filtering were to be added (as opposed to the current default of just Albums) this class would be useful in constructing our :class:`musicbrainzapi.api.lyrics.Lyrics` objects consistently. If other ways of filtering were to be added (as opposed to the current default of just Albums) then this class would be useful to build our :class:`musicbrainzapi.api.lyrics.Lyrics` objects consistently.
Additional functionality to the lyrics command Additional functionality to the lyrics command
----------------------------------------------- -----------------------------------------------
@@ -68,7 +68,7 @@ The ability for the user to specify something other than album or year to group
Multiple artists Multiple artists
^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
Searching for multiple artists and comparing is certainly possible in the current iteration (click provides a nice way to accept multiple artists and then we create our ``Lyrics`` objects from these) this wasn't implemented. There are rate limiting factors which may slow down the program and increase runtime considerably. Searching for multiple artists and comparing is certainly possible in the current iteration (click provides a nice way to accept multiple artists and then we create our ``Lyrics`` objects from these) this wasn't implemented. There are rate limiting factors which may slow down the program and in the current implementation it could increase runtime considerably.
Speed improvements Speed improvements
------------------- -------------------
@@ -79,7 +79,7 @@ One solution would be to implement threading - as we are waiting on HTTP request
This wasn't implemented primarily because of time - but threading could be implemented on each call we make to the API. This wasn't implemented primarily because of time - but threading could be implemented on each call we make to the API.
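The threading idea above could be sketched roughly as follows. This is a minimal illustration using the standard library's ``concurrent.futures``, not the project's code: ``fetch_all`` and ``fake_fetch`` are hypothetical names, and ``fake_fetch`` stands in for a real HTTP call so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Call ``fetch`` for every URL on a small thread pool."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL
        # with the result of its own call.
        return dict(zip(urls, pool.map(fetch, urls)))


def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP request."""
    return url.rsplit('/', 1)[-1].upper()


results = fetch_all(
    ['https://example.invalid/a', 'https://example.invalid/b'],
    fake_fetch,
)
print(results)
```

In a real run ``fetch`` would wrap something like ``requests.get``. Keeping ``max_workers`` small is one way to stay within the rate limits mentioned earlier.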
An alternative, and I beleive an interesting solution, would be to use AWS Lambda (serverless). An alternative, and I believe an interesting solution, would be to use AWS Lambda (serverless).
There is a caveat to this solution and it is cost - threading is free but adds development time and increases complexity. AWS isn't free but allows you to scale the requests out. There is a caveat to this solution and it is cost - threading is free but adds development time and increases complexity. AWS isn't free but allows you to scale the requests out.
@@ -98,4 +98,29 @@ If more control was needed one solution could be:
This requires the user to have an internet connection - which is a current requirement. Requests to the api could be made simultaneously - without adding the complexity that comes with threading. This would not solve any API rate limiting - we are required to provide an application user_agent to the api to identify the app. This requires the user to have an internet connection - which is a current requirement. Requests to the api could be made simultaneously - without adding the complexity that comes with threading. This would not solve any API rate limiting - we are required to provide an application user_agent to the api to identify the app.
An interesting solution, and one I did consider, was to have the program run entirely in Lambda, requiring no dependencies and just a simple front end that sends requests and uses ``boto3`` to retrieve the results. The simplicity of this, and the fact that AWS provide an SDK for many languages, means the client code could run in any language.
An interface to AWS API Gateway would provide the entry point to the lambda.
Writing it in this manner (with an api backend) would mean a webapp of the program could be possible, with the frontend served with something like ``Vuejs`` or ``React``.
.. _Zappa: https://github.com/Miserlou/Zappa .. _Zappa: https://github.com/Miserlou/Zappa
Error catching
--------------
Handling missing data from both APIs is done with error catching (namely ``ValueError`` and ``TypeError``).
Although inelegant, and not guaranteed to capture the specific behaviour we want to catch (missing data etc.) it is a solution and appears to work quite well.
Musicbrainz provides a schema for their api. If this were to be placed in a production environment then readdressing this should be a priority - we should be checking the values returned, using the schema as a guide, and replacing missing values accordingly. We should not rely on ``try except`` blocks to do this as it can be unreliable and is prone to raise other errors.
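As a rough sketch of checking returned values explicitly instead of relying on ``try except``: the field names and defaults below are assumptions made up for illustration, not taken from the Musicbrainz schema, and ``normalise_release`` is a hypothetical helper.

```python
from typing import Any, Dict, Mapping

# Hypothetical subset of fields and fallback values. The real
# Musicbrainz schema is much larger and is not reproduced here.
EXPECTED_FIELDS: Dict[str, Any] = {
    'id': None,
    'title': 'Unknown title',
    'date': None,
}


def normalise_release(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a dict with every expected field, using defaults for gaps."""
    clean = {}
    for field, default in EXPECTED_FIELDS.items():
        value = raw.get(field)
        # Treat missing keys, None and empty strings as missing data.
        clean[field] = default if value in (None, '') else value
    return clean


release = normalise_release({'id': 'abc', 'title': ''})
print(release)
```

Because every expected field is present after normalising, later code can read ``release['date']`` without guarding against ``KeyError`` or ``TypeError``.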
Further statistical analysis
----------------------------
Standard descriptive statistics are provided. I did consider including a deeper analysis but opted not to for several reasons:
- Without a specific problem or question to answer - explorative work can take a lot of time and may not yield satisfactory results. Questions I did consider are:
+ `For active artists, based on their previous lyrics count what is the prediction for their next album?` Although a sensible question I'm not sure how useful the prediction would be - I am sure some artists would follow a pattern over time, but I'm not convinced all artists would and I imagine the results would be mixed.
+ `Anomaly detection - for artists with large releases, what albums stood out as larger than usual and what feature (or track) caused this anomaly?` - This would be a good question to answer and we have many tools available. As we have numeric data - clustering could be a candidate (DBSCAN or even K-means). I opted not to because of time and the fact it would bloat the requirements. Feature flags are an option when handling extra packages, ``pip install musicbrainzapi[analysis]`` for example, but nonetheless this would be an interesting question to answer and I believe one of the easier ones to implement if it was desired.
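To give a feel for the anomaly detection question without adding dependencies, here is a small sketch. It does not use DBSCAN or K-means. It swaps in a simpler technique, a modified z-score based on the median absolute deviation, using only the standard library. The album names, counts, function name and the 3.5 threshold are all assumptions for illustration.

```python
from statistics import median
from typing import Dict, List


def unusually_large_albums(
    word_counts: Dict[str, int], threshold: float = 3.5
) -> List[str]:
    """Return albums whose word count sits far above the median."""
    values = list(word_counts.values())
    centre = median(values)
    # Median absolute deviation: a spread measure that is not
    # dominated by the outliers we are trying to find.
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # No spread, so nothing can stand out.
    return [
        album
        for album, count in word_counts.items()
        # 0.6745 scales the MAD so scores are comparable to z-scores.
        if 0.6745 * (count - centre) / mad > threshold
    ]


# Made-up word counts per album.
counts = {'A': 1200, 'B': 1250, 'C': 1180, 'D': 1230, 'E': 4000}
print(unusually_large_albums(counts))
```

A clustering approach would need an extra package, which is where an optional ``analysis`` extra would come in.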

@@ -33,10 +33,8 @@ entry_points = \
setup_kwargs = { setup_kwargs = {
'name': 'musicbrainzapi', 'name': 'musicbrainzapi',
'version': '1.0.0', 'version': '1.0.0',
'description': '',
'long_description': None, 'long_description': None,
'description': 'Python module to calculate statistics and generate a wordcloud for a given artist. Uses the Musicbrainz API and the lyrics.ovh API.', 'description': 'Python module to calculate statistics and generate a wordcloud for a given artist. Uses the Musicbrainz API and the lyrics.ovh API.',
'long_description': '',
'author': 'dtomlinson', 'author': 'dtomlinson',
'author_email': 'dtomlinson@panaetius.co.uk', 'author_email': 'dtomlinson@panaetius.co.uk',
'maintainer': None, 'maintainer': None,

@@ -1,6 +1,6 @@
""" """
musicbrainzapi: A CLI lyrics searcher musicbrainzapi: A CLI lyrics searcher.
===================================== ======================================
This module was written by dtomlinson <dtomlinson@panaetius.co.uk> for Aire Logic This module was written by dtomlinson <dtomlinson@panaetius.co.uk> for Aire Logic

@@ -1,3 +1,7 @@
"""
Lyrics object with statistics.
===============================
"""
from __future__ import annotations from __future__ import annotations
from typing import Union, Dict, List from typing import Union, Dict, List
from dataclasses import dataclass from dataclasses import dataclass
@@ -10,8 +14,7 @@ import numpy as np
@dataclass @dataclass
class Lyrics: class Lyrics:
"""Lyrics object for an artist. """Lyrics object for an artist."""
"""
artist_id: str artist_id: str
artist: str artist: str

@@ -106,6 +106,7 @@ class LyricsBuilder(LyricsConcreteBuilder):
------- -------
str str
URL for lyrics from the lyrics api. URL for lyrics from the lyrics api.
""" """
lyrics_api_base = 'https://api.lyrics.ovh/v1' lyrics_api_base = 'https://api.lyrics.ovh/v1'
lyrics_api_url = html.escape(f'{lyrics_api_base}/{artist}/{song}') lyrics_api_url = html.escape(f'{lyrics_api_base}/{artist}/{song}')
@@ -123,7 +124,8 @@ class LyricsBuilder(LyricsConcreteBuilder):
Returns Returns
------- -------
str str
Lyrics of the trakc Lyrics of the track.
""" """
resp = requests.get(url) resp = requests.get(url)
@@ -192,6 +194,7 @@ class LyricsBuilder(LyricsConcreteBuilder):
return _d return _d
def __init__(self) -> None: def __init__(self) -> None:
"""Create a builder instance to build a Lyrics object."""
self.reset() self.reset()
def reset(self) -> None: def reset(self) -> None:
@@ -208,8 +211,7 @@ class LyricsBuilder(LyricsConcreteBuilder):
return self return self
def sort_artists(self) -> None: def sort_artists(self) -> None:
"""Sort the artists from the Musicbrainzapi """Sort the artists from the Musicbrainzapi."""
"""
self._sort_names = dict( self._sort_names = dict(
(i.get('id'), f'{i.get("name")} | {i.get("disambiguation")}') (i.get('id'), f'{i.get("name")} | {i.get("disambiguation")}')
if i.get('disambiguation') is not None if i.get('disambiguation') is not None
@@ -241,8 +243,7 @@ class LyricsBuilder(LyricsConcreteBuilder):
return self return self
def find_all_albums(self) -> None: def find_all_albums(self) -> None:
"""Find all albums for the chosen artist """Find all albums for the chosen artist."""
"""
limit, offset, page = (100, 0, 1) limit, offset, page = (100, 0, 1)
resp_0 = addict.Dict( resp_0 = addict.Dict(
@@ -365,8 +366,7 @@ class LyricsBuilder(LyricsConcreteBuilder):
return self return self
def find_lyrics_urls(self) -> None: def find_lyrics_urls(self) -> None:
"""Construct the URL for the lyrics api. """Construct the URL for the lyrics api."""
"""
self.all_albums_lyrics_url = list() self.all_albums_lyrics_url = list()
for x in self.all_albums: for x in self.all_albums:
for alb, tracks in x.items(): for alb, tracks in x.items():

@@ -10,6 +10,7 @@ class LyricsClickDirector:
"""Director for Lyrics builder.""" """Director for Lyrics builder."""
def __init__(self) -> None: def __init__(self) -> None:
"""Create a Director to orchestrate the builder."""
self._builder = None self._builder = None
@staticmethod @staticmethod
@@ -62,6 +63,7 @@ class LyricsClickDirector:
------ ------
SystemExit SystemExit
If no artist is found will cleanly quit. If no artist is found will cleanly quit.
""" """
artist_meta = None artist_meta = None
for i, j in self.builder._top_five_results.items(): for i, j in self.builder._top_five_results.items():
@@ -111,8 +113,7 @@ class LyricsClickDirector:
return self return self
def _query_for_data(self) -> None: def _query_for_data(self) -> None:
"""Query Musicbrainz api for albums + track data. """Query Musicbrainz api for albums + track data."""
"""
self.builder.find_all_albums() self.builder.find_all_albums()
self.builder.find_all_tracks() self.builder.find_all_tracks()
self.builder._product.all_albums_with_tracks = self.builder.all_albums self.builder._product.all_albums_with_tracks = self.builder.all_albums

@@ -6,55 +6,64 @@ import click
from musicbrainzapi.__version__ import __version__ from musicbrainzapi.__version__ import __version__
from musicbrainzapi.__header__ import __header__ from musicbrainzapi.__header__ import __header__
CONTEXT_SETTINGS = dict(auto_envvar_prefix='COMPLEX') # pylint:disable=invalid-name
CONTEXT_SETTINGS = dict(auto_envvar_prefix="COMPLEX")
class Environment(object): class Environment:
"""Environment class to house shared parameters between all subcommands."""
def __init__(self): def __init__(self):
self.verbose = False self.verbose = False
self.home = os.getcwd() self.home = os.getcwd()
pass_environment = click.make_pass_decorator(Environment, ensure=True) pass_environment = click.make_pass_decorator(
Environment, ensure=True
)
cmd_folder = os.path.abspath( cmd_folder = os.path.abspath(
os.path.join(os.path.dirname(__file__), 'commands') os.path.join(os.path.dirname(__file__), "commands")
) )
class ComplexCLI(click.MultiCommand): class ComplexCLI(click.MultiCommand):
"""Access and run subcommands."""
def list_commands(self, ctx): def list_commands(self, ctx):
rv = [] """List all subcommands."""
for filename in os.listdir(cmd_folder): rv = [
if filename.endswith('.py') and filename.startswith('cmd_'): filename[4:-3]
rv.append(filename[4:-3]) for filename in os.listdir(cmd_folder)
if filename.endswith(".py") and filename.startswith("cmd_")
]
rv.sort() rv.sort()
return rv return rv
def get_command(self, ctx, cmd_name): def get_command(self, ctx, cmd_name):
mod = import_module(f'musicbrainzapi.cli.commands.cmd_{cmd_name}') """Get chosen subcommand."""
mod = import_module(f"musicbrainzapi.cli.commands.cmd_{cmd_name}")
return getattr(mod, cmd_name) return getattr(mod, cmd_name)
@click.command(cls=ComplexCLI, context_settings=CONTEXT_SETTINGS) @click.command(cls=ComplexCLI, context_settings=CONTEXT_SETTINGS)
@click.option( @click.option(
'-p', "-p",
'--path', "--path",
type=click.Path( type=click.Path(exists=True, file_okay=False, resolve_path=True, writable=True),
exists=True, file_okay=False, resolve_path=True, writable=True help="Local path to save any output files.",
), default=os.getcwd(),
help='Local path to save any output files.',
default=os.getcwd()
) )
# @click.option('-v', '--verbose', is_flag=True, help='Enables verbose mode.') @click.option("-v", "--verbose", is_flag=True, help="Enables verbose mode.")
@click.version_option( @click.version_option(
version=__version__, version=__version__,
prog_name=__header__, prog_name=__header__,
message=f'{__header__} version {__version__} 🎤', message=f"{__header__} version {__version__} 🎤",
) )
@pass_environment @pass_environment
def cli(ctx, path): def cli(ctx, verbose, path):
"""Base command for the musicbrainzapi program.""" """Display base command for the musicbrainzapi program."""
# ctx.verbose = verbose ctx.verbose = verbose
if path is not None: if path is not None:
click.echo(f'Path set to {os.path.expanduser(path)}') click.echo(f"Path set to {os.path.expanduser(path)}")
ctx.path = os.path.expanduser(path) ctx.path = os.path.expanduser(path)
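
The subcommand discovery in ``list_commands`` depends on a file naming convention: ``cmd_<name>.py`` becomes the subcommand ``<name>``. A minimal standalone sketch of that slicing follows. The helper name and the example file names are hypothetical, and the real method reads the names from ``cmd_folder`` with ``os.listdir``.

```python
from typing import Iterable, List


def command_names(filenames: Iterable[str]) -> List[str]:
    """Map files named cmd_<name>.py to sorted subcommand names."""
    return sorted(
        # [4:-3] drops the leading 'cmd_' and the trailing '.py'.
        name[4:-3]
        for name in filenames
        if name.startswith('cmd_') and name.endswith('.py')
    )


# Hypothetical directory listing.
names = command_names(
    ['cmd_lyrics.py', '__init__.py', 'cmd_albums.py', 'notes.txt']
)
print(names)
```

Files that do not match the convention are ignored, so helper modules can live in the same folder without being exposed as subcommands.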

@@ -1,3 +1,7 @@
"""
Wordcloud from lyrics.
"""
from __future__ import annotations from __future__ import annotations
import collections import collections
from importlib import resources from importlib import resources
@@ -41,6 +45,8 @@ class LyricsWordcloud:
all_albums_lyrics_count: 'Lyrics.all_albums_lyrics_count', all_albums_lyrics_count: 'Lyrics.all_albums_lyrics_count',
): ):
""" """
Create a wordcloud object.
Parameters Parameters
---------- ----------
pillow_img : PIL.PngImagePlugin.PngImageFile pillow_img : PIL.PngImagePlugin.PngImageFile
@@ -55,12 +61,14 @@ class LyricsWordcloud:
def use_microphone( def use_microphone(
cls, all_albums_lyrics_count: 'Lyrics.all_albums_lyrics_count', cls, all_albums_lyrics_count: 'Lyrics.all_albums_lyrics_count',
) -> LyricsWordcloud: ) -> LyricsWordcloud:
"""Class method to instantiate with a microphone base image. """
Class method to instantiate with a microphone base image.
Parameters Parameters
---------- ----------
all_albums_lyrics_count : Lyrics.all_albums_lyrics_count all_albums_lyrics_count : Lyrics.all_albums_lyrics_count
List of all albums + track lyrics counted by each word List of all albums + track lyrics counted by each word
""" """
mic_resource = resources.path( mic_resource = resources.path(
'musicbrainzapi.wordcloud.resources', 'mic.png' 'musicbrainzapi.wordcloud.resources', 'mic.png'
@@ -78,7 +86,7 @@ class LyricsWordcloud:
*args, *args,
**kwargs, **kwargs,
) -> str: ) -> str:
"""Static method to generate a random grey colour""" """Static method to generate a random grey colour."""
colour = f'hsl(0, 0%, {random.randint(60, 100)}%)' colour = f'hsl(0, 0%, {random.randint(60, 100)}%)'
return colour return colour

Binary file not shown.