Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Optimize getting Price data #14

Open
scientes opened this issue Mar 2, 2021 · 13 comments · May be fixed by #16
Open

Optimize getting Price data #14

scientes opened this issue Mar 2, 2021 · 13 comments · May be fixed by #16
Assignees
Labels
help wanted Extra attention is needed PriceData

Comments

@scientes
Copy link
Contributor

scientes commented Mar 2, 2021

Currently the price is calculated by getting the data for 60s from binance and then an average is made from these 60s.
Thus for every trade 3 calls are made to binance if i'm correct (buy/sell/fee) wouldn't it be easier to get the data for a larger timeframe for some coins (e.g bnb for fees and comissions) generate a lot of small transactions within a time frame.

I do not know when exactly the prices are calculated but a simple idea from my side:

1.group transactions of one coin in timeranges which are under one hour (max range on binance)
2.download the needed timerange
3.select the 60s needed for the calculation of the prices and calculate the prices and save to the db
4. maybe cache the api timerange if needed again (dunno if useful)

this could reduce the amount of api calls from for ex. bnb from 1 per entry to about 1 api request per 5-6 entries for bnb

if you have many entries (for ex 8000, around half or more are bnb,mostly due to comissions and fees) it would take around 5-6hours for the prices to load (2s per call) and if it would be possible to optimize that with stated algorithm it would maybe only take 3h)

if there is a better way to get this data would also be good for example get the ohlcv data from binance for the 1m timerange, cctx is a good lib for that, it works with many exhanges or just use: https://binance-docs.github.io/apidocs/spot/en/#kline-candlestick-data . on binance you can request 1000 1min candles with a single request if i'm not wrong, which would be alto faster.

@provinzio
Copy link
Owner

Good idea, I thought about this myself. I would love a PR on this.

@provinzio provinzio added the help wanted Extra attention is needed label Mar 2, 2021
@provinzio provinzio changed the title Optimize getting Price data from binance Optimize getting Price data Mar 2, 2021
@scientes
Copy link
Contributor Author

scientes commented Mar 2, 2021

Ill try to implement it. Via forks or Branches? What Do you preferr?

@provinzio
Copy link
Owner

Forks

@scientes
Copy link
Contributor Author

scientes commented Mar 2, 2021

are self.books.operations sorted by date?

@provinzio
Copy link
Owner

No. Just inserted in order of the different files.
They get sorted before the evaluation in the class Taxman. But we could sort earlier.

@scientes
Copy link
Contributor Author

scientes commented Mar 2, 2021

My changes are made in the prices class and that is called in taxman, so if it is sorted before Evaluation it should be fine.

@provinzio
Copy link
Owner

provinzio commented Mar 3, 2021

But they are only sorted locally. Have a look at the code.

operations = sorted(operations, key=lambda op: op.utc_time)

@scientes scientes linked a pull request Mar 5, 2021 that will close this issue
@mastershaig
Copy link

Hi guys,

I'm not sure if below issue is related to this issue or not, anyways I wrote, please tell me if I've to create a separate issue for this one. Recently I tested the CoinTaxman with my Kraken trading history and I saw that script's price data isn't same as the real trading one. Because it gets price closest to the given timestamp and most of the time this is not precise.

For example, I sold dogecoin for 0.400000, but price data gets price as 0.401000 from kraken api and that's why it calculates my gains in the end a bit wrongly.

I'm just curious and have some questions about this issue, how to fix it and overall your opinions why you used certain methods:

  • Does this PR fix this issue as well ?
  • Why you get price information from source (Kraken/Binance) not from CSV file ? I know price is not written obviously in the csv but it's possible to get the price from csv somehow?
  • Is it good idea to filter received price api results for both closest timestamp and volume of the trading ? I mean to find the exact transaction from api ? This way you can get the price precisely, right ?

Many thanks!

@provinzio
Copy link
Owner

provinzio commented May 31, 2021

Does this PR fix this issue as well ?

Not the scope of this issue. This issue is just about getting prices from the API regardless of the CSV input.

You might want to create a new issue for this and adjust the implementation of the Book class.

Why you get price information from source (Kraken/Binance) not from CSV file ? I know price is not written obviously in the csv but it's possible to get the price from csv somehow?

Might have been slipped through. When it's possible, I think it is the best idea to get it directly from the csv file.

Is it good idea to filter received price api results for both closest timestamp and volume of the trading ? I mean to find the exact transaction from api ? This way you can get the price precisely, right ?

I am not sure about this. If the information is there, this might be possible. But most of the time it's not, because of missing information. For example, (really old) historical prices are most of the time not available as single transactions, but just as 1-minute-candles or "bigger".

@Griffsano
Copy link
Contributor

Griffsano commented Dec 13, 2021

To address the problem stated above, we could evaluate the price from the CSVs, store it in the operation, and later only fetch the API price data if we don't have it in the right fiat (EUR)?
Example ETH/EUR: Store ETH/EUR price data from CSV, this is all we need.
Example ETH/BTC: Store ETH/BTC price data from CSV, later fetch BTC/EUR price data from the API. Then either calculate ETH/EUR or also fetch this from the API.
Example ETH/USD: Same as ETH/BTC, but here I would rather only fetch USD/EUR from the API and calculate ETH/EUR with this.

@scientes
Copy link
Contributor Author

you seem like my second brain^^ my solution is/was very similar

@Griffsano
Copy link
Contributor

Griffsano commented Dec 13, 2021

xD Let me know when/where I can help.
Do you plan to do this approach within the same pull request or as a new issue? Since the price data has to be evaluated in each CSV read-in function, it might be better to merge the branches first to reduce merge conflicts?

Edit: #96

@scientes
Copy link
Contributor Author

To address the problem stated above, we could evaluate the price from the CSVs, store it in the operation, and later only fetch the API price data if we don't have it in the right fiat (EUR)? Example ETH/EUR: Store ETH/EUR price data from CSV, this is all we need. Example ETH/BTC: Store ETH/BTC price data from CSV, later fetch BTC/EUR price data from the API. Then either calculate ETH/EUR or also fetch this from the API. Example ETH/USD: Same as ETH/BTC, but here I would rather only fetch USD/EUR from the API and calculate ETH/EUR with this.

this will be almost seperate from my current pr and i will make a seperate pr for that. when that is merged i only need to edit my current pr so it also uses price data already on disk (should not be that hard)

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
help wanted Extra attention is needed PriceData
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants