rpyc consume cpu on "big" data #329
Some initial data was generated by the test below. Confirmed that increasing `MAX_IO_CHUNK` improves the timings for large payloads.
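For context, `MAX_IO_CHUNK` (on `rpyc.core.stream.SocketStream`, referenced in the test below) caps how many bytes each socket read/write moves at once, so a larger chunk means fewer I/O operations per payload. A minimal pure-Python sketch of that relationship (chunk sizes here are illustrative, not rpyc defaults):

```python
def chunked(data, max_io_chunk):
    """Yield slices of at most max_io_chunk bytes, mimicking how a
    stream splits one large write into many smaller ones."""
    for i in range(0, len(data), max_io_chunk):
        yield data[i:i + max_io_chunk]

payload = b"x" * 1_000_000  # a 1 MB payload

# Smaller chunks mean many more write operations for the same payload.
writes_small_chunk = sum(1 for _ in chunked(payload, 8_000))    # 125 writes
writes_large_chunk = sum(1 for _ in chunked(payload, 640_000))  # 2 writes
```

Each write costs a syscall plus protocol bookkeeping, which is where the CPU time goes on "big" data.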
Test to show this:

```python
from __future__ import print_function
import sys
import pickle  # noqa
import timeit
import rpyc
import unittest
from nose import SkipTest
import cfg_tests
try:
    import pandas as pd
    import numpy as np
except Exception:
    raise SkipTest("Requires pandas, numpy, and tables")

DF_ROWS = 2000
DF_COLS = 2500


class MyService(rpyc.Service):
    on_connect_called = False
    on_disconnect_called = False

    def on_connect(self, conn):
        self.on_connect_called = True

    def on_disconnect(self, conn):
        self.on_disconnect_called = True

    def exposed_write_data(self, dataframe):
        rpyc.classic.obtain(dataframe)

    def exposed_ping(self):
        return "pong"


class TestServicePickle(unittest.TestCase):
    """Issues #323 and #329 showed that for large objects there is an excessive number of round trips.

    This test case should check the interrelations of
    + MAX_IO_CHUNK
    + min twrite
    + occurrence rate of socket timeouts for other clients
    """
    config = {}

    def setUp(self):
        self.cfg = {'allow_pickle': True}
        self.server = rpyc.utils.server.ThreadedServer(MyService, port=0, protocol_config=self.cfg.copy())
        self.server.logger.quiet = False
        self.thd = self.server._start_in_thread()
        self.conn = rpyc.connect("localhost", self.server.port, config=self.cfg)
        self.conn2 = rpyc.connect("localhost", self.server.port, config=self.cfg)
        # globals are made available to timeit, prepare them
        cfg_tests.timeit['conn'] = self.conn
        cfg_tests.timeit['conn2'] = self.conn2
        cfg_tests.timeit['df'] = pd.DataFrame(np.random.rand(DF_ROWS, DF_COLS))

    def tearDown(self):
        self.conn.close()
        self.server.close()
        self.thd.join()
        cfg_tests.timeit.clear()

    def test_dataframe_pickling(self):
        # The proxy will sync with the pickle handle and default protocol and provide
        # this as the argument to pickle.load. By timing how long pickle.dumps and
        # pickle.loads take without any round trips, the overhead of the RPyC protocol
        # can be found.
        rpyc.core.channel.Channel.COMPRESSION_LEVEL = 1
        # rpyc.core.stream.SocketStream.MAX_IO_CHUNK = 65355 * 10
        level = rpyc.core.channel.Channel.COMPRESSION_LEVEL
        max_chunk = rpyc.core.stream.SocketStream.MAX_IO_CHUNK
        repeat = 3
        number = 1
        pickle_stmt = 'pickle.loads(pickle.dumps(cfg_tests.timeit["df"]))'
        write_stmt = ('rpyc.lib.spawn(cfg_tests.timeit["conn"].root.write_data, cfg_tests.timeit["df"]); '
                      '[cfg_tests.timeit["conn2"].root.ping() for i in range(30)]')
        t = timeit.Timer(pickle_stmt, globals=globals())
        tpickle = min(t.repeat(repeat, number))
        t = timeit.Timer(write_stmt, globals=globals())
        twrite = min(t.repeat(repeat, number))
        headers = ['sample', 'tpickle', 'twrite', 'bytes', 'level', 'max_chunk']  # noqa
        data = [repeat, tpickle, twrite, sys.getsizeof(cfg_tests.timeit['df']), level, max_chunk]
        data = [str(d) for d in data]
        with open('/tmp/time.csv', 'a') as f:
            print(','.join(headers), file=f)
            print(','.join(data), file=f)


if __name__ == "__main__":
    unittest.main()
```
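One caveat about the `bytes` column above: `sys.getsizeof` is shallow, so for a DataFrame it reports only the outer object's footprint, not the underlying buffers; `df.memory_usage(deep=True).sum()` would be closer to the real payload size. A stdlib-only illustration of the shallow/deep gap:

```python
import sys

# A list holding ten 10 KB byte strings: roughly 100 KB of payload.
outer = [b"x" * 10_000 for _ in range(10)]

shallow = sys.getsizeof(outer)  # size of the list object only
deep = shallow + sum(sys.getsizeof(b) for b in outer)  # include the buffers

gap = deep - shallow  # the ~100 KB a shallow measurement misses
```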
For now, the improvements made should be sufficient to close this issue. Other optimizations aren't specific to this issue.
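The timing pattern the test relies on, taking the `min` over several single runs to suppress scheduling noise, can be seen in isolation with just the stdlib (the statement and data size here are illustrative):

```python
import pickle
import timeit

data = list(range(10_000))

# repeat=3, number=1: three independent single runs; keep the best one,
# mirroring min(t.repeat(repeat, number)) in the test above.
t = timeit.Timer('pickle.loads(pickle.dumps(data))', globals=globals())
best = min(t.repeat(repeat=3, number=1))
```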
* Added warning to _remote_tb when the major version of local and remote mismatch (tomerfiliba-org#332)
* Added `include_local_version` to DEFAULT_CONFIG to allow for configurable security controls (e.g. `include_local_traceback`)
* Update readme.txt
* Added break to client process loop when everything is dead
* Increased chunk size to improve multi-client response time and throughput of large data (tomerfiliba-org#329)
* Improved test for response of client 1 while transferring a large amount of data to client 2
* Cleaned up coding style of test_service_pickle.py
* Updated issue template
* Added VS Code testing cfgs; updated gitignore venv
* Changed settings.json to use env USERNAME
* Name pack casted in _unbox to fix IronPython bug. Fixed tomerfiliba-org#337
* Fixed netref.class_factory id_pack usage per tomerfiliba-org#339 and added test cases
* Added .readthedocs.yml and requirements to build
* Make OneShotServer terminate after client connection ends
* Added unit test for OneShotServer. Fixed tomerfiliba-org#343
* Fixed 2.6 backwards incompatibility for format syntax
* Updated change log and bumped version

--- 4.1.1

* Added support for chained connections which result in netref being passed to get_id_pack. Fixed tomerfiliba-org#346
* Added tests for get_id_pack
* Added a test for issue tomerfiliba-org#346
* Corrected the connection used to inspect a netref
* Refactored __cmp__ getattr
* Extended rpyc over rpyc unit testing and removed port parameter from TestRestricted
* Added comment explaining the inspect for intermediate proxy. Fixed tomerfiliba-org#346
* Improved docstring for serve_threaded to address when and when not to use the method. Done tomerfiliba-org#345
* Release 4.1.2
* Fixed versions referred to in security.rst
* Link docs instead of MITRE
* Set up logging with a better formatter
* Fix bug when proxy context-manager is being exited with an exception (#1)
* Logging: add a rotating file log handler
Environment
Minimal example
Server:
Client:
but it works fine:
passed too: