Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Dev hystrix plugin #1594

Merged
merged 16 commits into from
Sep 5, 2016
Merged

Dev hystrix plugin #1594

merged 16 commits into from
Sep 5, 2016

Conversation

jiaqifeng
Copy link
Contributor

This plugin supports HystrixCommand from com.netflix.hystrix:hystrix-core. And show the topology of a client using HystrixCommand to call a service.
HystrixCommand is a "Latency and Fault Tolerance" for RPC call in Distributed Systems. HystrixCommand will run a RPC call in another service thread and provide circuit breaker functions. More could be found at https://github.com/Netflix/Hystrix/.

Below is screenshots showing 2 simple web app using Hystrix.

hystrix-topology
hystrix-trace

Any review advice is appreciate.

@nstopkimsk nstopkimsk added this to the 1.5.3 milestone Mar 20, 2016
@jiaqifeng
Copy link
Contributor Author

@nstopkimsk when will the 1.5.3 milestone be?

@nstopkimsk
Copy link
Contributor

@jiaqifeng Not yet defined. 1.5.2 will be released soon. after that we will discuss it.

@pparth
Copy link

pparth commented Apr 14, 2016

+1 to this. Very important.

@pparth
Copy link

pparth commented Apr 14, 2016

Is this compatible to 1.4.x hystrix version line too? Or 1.5.x only?

@jiaqifeng
Copy link
Contributor Author

@pparth I just test it with com.netflix.hystrix:hystrix-core:1.3.20 which my app used. The version is specified in HystrixCommandIT integration test. What is your interest in it? If it is reasonable I would like investigate it if I have time for it ^-^.

@jiaqifeng
Copy link
Contributor Author

@pparth Currently, this plugin only show it's a hystix call, withnot the name of the real subclass of hystrixCommand which I thought is more meaningful. I am considering to add this function too.

@pparth
Copy link

pparth commented Apr 18, 2016

Cool. No problem on my side. Just thinking that hystrix instrumentation is something that we would like to have as we make extensive use of this framework. We maintain 2 versions of our Core module, one that uses the 1.4.X line and one that uses the 1.5.X line (which has major differences that are supposed to be backward ocmpatible).

@pparth
Copy link

pparth commented Apr 20, 2016

@jiaqifeng We make extensive use of Hystrix in a micro-service architecture deployed in upwork.com. We will gladly help with testing what you prepared. Do you have an agent build for us to use?

@jiaqifeng
Copy link
Contributor Author

jiaqifeng commented Apr 21, 2016

@pparth I update code to support 1.4.x version while I just test it with a simple demo using 1.4.20 and 1.5.2. For I will be busy for next several weeks I will update more test for it really a little later. I'd like to see you to try it. Any feedback is appreciate.

@jiaqifeng
Copy link
Contributor Author

There are 2 TODOs left:
1 integration test for more hystrix versions
2 make SpanAsyncEventSimpleAroundInterceptor.getAsyncTraceId not private, then HystrixObservableCallInterceptor will not copy the all other codes.

@Xylus
Copy link
Contributor

Xylus commented Jun 28, 2016

@jiaqifeng Thanks for keeping up with this. Might be a good idea to rebase your Hystrix feature branch against the master instead of the merge commit (f833089).
You can do this by committing hystrix related code to dev-hystrix-plugin branch, then switch and update your master branch to the latest. Once done, run git rebase master dev-hystrix-plugin from the master branch. (You might want to reset the f833089 commit before you do all this)

Let me know if you have any questions.
Thanks!

@jiaqifeng
Copy link
Contributor Author

@Xylus Thank you for your advice, I will rebase it in a week. Will this PR be accepted in the next release? Or any else must be done for accepting?

@jiaqifeng jiaqifeng closed this Jun 29, 2016
@jiaqifeng jiaqifeng reopened this Jun 29, 2016
@Frusty
Copy link

Frusty commented Jun 29, 2016

@jiaqifeng
I have added your PR to my local test pinpoint installation. We use Netflix Feign as a client for frontend services to talk to backends. Feign can wrap it's calls with hystrix (which is what we want) and this works very well for us.

When I do add pinpoint with your PR, it does not seem to wrap outgoing calls with the needed headers. My Pinpoint shows calls to the backend from "USER" instead from the frontend.

The frondend call stack shows the HystrixCommand ( queue() ) but does not go deeper into the call.

screen shot 2016-06-29 at 16 15 44

Any ideas? Any logs or screenshots I can supply you with?

@jiaqifeng
Copy link
Contributor Author

@Frusty
It seems the trace does not continue in hystrix executing thread.
Which commit of my hystrix plugin are you using?
What version of HystrixCommand are you using? The hystirx's code was different for different version. Currently it doesnot work for hystirx 1.5.3 and later.
Log file when frontend using the hystrix will help a lot for I add logs for debug and you could check with log in code by your self if you like. Also, if you can provide the call stacktrace for your subclass of HystrixCommand maybe help too.

@Frusty
Copy link

Frusty commented Jul 1, 2016

@jiaqifeng
Okay, this explains it. We are currently using hystrix.1.5.3.

@jiaqifeng
Copy link
Contributor Author

@Frusty
Why do you use 1.5.3? Just for latest version?
I have updated this plugin twice for hystrix code changes. While it seems I have to update it to support 1.5.3 later.

@jiaqifeng jiaqifeng force-pushed the dev-hystrix-plugin branch from f833089 to aeaee45 Compare July 4, 2016 05:23
@jiaqifeng jiaqifeng force-pushed the dev-hystrix-plugin branch from 9de6c84 to ace1da7 Compare July 4, 2016 08:19
@jiaqifeng
Copy link
Contributor Author

@Frusty
I added support for 1.5.3. Any feedback are welcome.

@jiaqifeng
Copy link
Contributor Author

@Xylus
Why did you suggest me to rebase instead of merge? I know rebase make a more clear history than merge. Is this your prefer or the require of pinpoint dev process?

@Xylus
Copy link
Contributor

Xylus commented Jul 5, 2016

@jiaqifeng
I asked about the rebase because of the commit mentioned in my post above. This seems like it's been fixed :) We do however prefer rebasing before making a PR to make the history a bit more clear (and also squashing into smaller commits), but it's not required.

@Frusty
Copy link

Frusty commented Jul 11, 2016

@jiaqifeng
Updated to your latest commit and tested with Hystrix enabled. We now see traces across services, so this works as expected.
Thank you!

*
*/
public interface HystrixTestConstants {
public static final String VERSION = "1.5.2-SNAPSHOT";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this but should this be 1.5.2-SNAPSHOT?
You could also use Version.java if you have pinpoint-commons dependency.

@jiaqifeng
Copy link
Contributor Author

@Xylus
I have update as your advice which seems reasonable. While integration test HystrixCommand_1_5_x_IT will failed. It seems due to fail to resolve "[1.5.3,)". This is why I did not commit it before. What's the detail about how the artifact version is resolved? Could you give me some advice?
Currently hystrix-core version that could be find is 1.5.3, 1.5.3-rc.1, 1.5.3-rc.2

@Xylus
Copy link
Contributor

Xylus commented Jul 18, 2016

@jiaqifeng hmm that is weird, [1.5.3,) should be resolved into fetching 1.5.3, 1.5.3-rc.1, 1.5.3-rc.2, and everything after.
What you've done here is correct.
Are you getting No version in the given range: com.netflix.hystrix:hystrix-core:jar:[1.5.3,) error?

@Xylus
Copy link
Contributor

Xylus commented Sep 1, 2016

@jiaqifeng
Hi jiaqifeng, we're preparing to release 1.6.0 soon (in about a month?) and we'd like to get this one in. Could you resolve the conflict so we can merge them to master? Let me know if you have any issues.
Thanks!

@jiaqifeng
Copy link
Contributor Author

@Xylus
I will merge it in a week. I'd glad this be accepted.

@Xylus
Copy link
Contributor

Xylus commented Sep 5, 2016

@jiaqifeng
Awesome! Let me know if you need me to do anything.
Thanks

@jiaqifeng jiaqifeng reopened this Sep 5, 2016
@jiaqifeng
Copy link
Contributor Author

@Xylus
I had hard reset the branch to a version on master by mistaken and the PR was auto CLOSED. Thanks god I can find it back and reset to the right version.

@Xylus
Copy link
Contributor

Xylus commented Sep 5, 2016

@jiaqifeng
heh, sounds scary. Glad you sorted it out.
Is the PR ready to be merged now?

@jiaqifeng
Copy link
Contributor Author

@Xylus
Yes, this PR is ready. If there anything need to be done please let know as soon as possible.

@Xylus
Copy link
Contributor

Xylus commented Sep 5, 2016

@jiaqifeng
Awesome, will probably test it out after we merge it first. The code looks good, might need some extensive testing on our side as well :)
Thanks again for the contribution!

@Xylus Xylus merged commit 3935cc5 into pinpoint-apm:master Sep 5, 2016
@clufeng
Copy link

clufeng commented Sep 13, 2016

@jiaqifeng
We have developed Hystrix-plugin:

transformTemplate.transform("com.netflix.hystrix.HystrixCommand", new TransformCallback() {
            @Override
            public byte[] doInTransform(Instrumentor instrumentor, ClassLoader classLoader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws InstrumentException {
                InstrumentClass target = instrumentor.getInstrumentClass(classLoader, className, classfileBuffer);
                InterceptorScope scope = instrumentor.getInterceptorScope(SCOPE_NAME);
                InstrumentMethod queue = target.getDeclaredMethod("queue");
                queue.addScopedInterceptor("com.navercorp.pinpoint.plugin.hystrix.interceptor.HystrixCommandQueueInterceptor", scope);
                return target.toBytecode();
            }
        });

       transformTemplate.transform("com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction", new TransformCallback() {
            @Override
            public byte[] doInTransform(Instrumentor instrumentor, ClassLoader classLoader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws InstrumentException {
                InterceptorScope scope = instrumentor.getInterceptorScope(SCOPE_NAME);

                InstrumentClass target = instrumentor.getInstrumentClass(classLoader, className, classfileBuffer);
                target.addField("com.navercorp.pinpoint.bootstrap.async.AsyncTraceIdAccessor");

                InstrumentMethod constructor = target.getConstructor("com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategy", "rx.functions.Action0");
                constructor.addScopedInterceptor("com.navercorp.pinpoint.plugin.hystrix.interceptor.HystrixCommandSchedulerActionInterceptor", scope, ExecutionPolicy.INTERNAL);

                InstrumentMethod queue = target.getDeclaredMethod("call");
                queue.addInterceptor("com.navercorp.pinpoint.plugin.hystrix.interceptor.HystrixCommandExecuteCommandInterceptor");
                return target.toBytecode();
            }
        });

for hystrix 1.5.3

@jiaqifeng
Copy link
Contributor Author

@clufeng
Thank you for comments. You intercepted "HystrixCommandSchedulerActionInterceptor.call()" method which will call "HystrixCommand$1.call()" that I intercepted in my code.
When supporting hystrix of version before 1.5.3, hystrix will spend more time than run the run() of real command class. This will cause the time interval reported is longer than the real "bussiness" function. So I choose to intercept the code as near of real run() method as I could.
Any comments is welcome to improve the code!

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants