-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
merge pubsub-hp into master #1593
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
… the master branch
…java into davidtorres-master
* merge from master * make pubsub high perf compile * make AckDeadlineRenewerTest pass
* use fewer guava classes in public surface - SubscriberImpl used to expose AbstractService. We just make this package-private. - SubscriberStats defines its own Stats type.
* make AutoValue work in PubSub Use error_prone_annotations so we don't clobber classpath with Java8 classes. Migrate SubscriberStats and PublisherStats to AutoValue
…ss to use containment in favor of inheritance
* Update * Fixing a race condition on shutdown
Some of our tests seem to still be flaking with timeout, eg https://travis-ci.org/GoogleCloudPlatform/google-cloud-java/jobs/185655553 Making the test not quiet should give us an idea of what's timing out.
* delete pull-related methods from PubSub Instead, provide a way to create the Subscriber object. * also delete modifyAckDeadline * google-java-format changed lines
This commit tries to deflake failures reported in https://ci.appveyor.com/project/garrettjonesgoogle/google-cloud-java/build/57 . The failure seems to come from `FakeSubscriberServiceImpl::sendError`. When calling `getAndAdvanceCurrentStream`, we waited for there to be a subscription by waiting on the `subscriptionInitialized` variable. The variable is notified before `addOpenedStream` is called though, and so it's possible that `getAndAdvanceCurrentStream` will see a list without any opened stream. The fix is to wait for at least one stream to be registered first.
* delete publish-related methods from PubSub
* make pubsub work with new gax update documentation links too
* Delete the handwritten Message type
* delete handwritten PushConfig * update code samples
* Delete handwritten PubSub layer * make Publisher/Subscriber builders public * also delete PubSubRpc
* fix doc annotations * clean up Publisher and Subscriber surface - PUBSUB_API_ADDRESS and PUBSUB_API_SCOPE are removed. We now use {Publisher,Subscriber}Settings instead. - Default settings are moved into the Builder class, so that they have package-private visibility. We can make them public later if we want to. - Subscriber doc is fixed so that it is not chopped off mid-sentence. - Subscriber doc mistakenly referred to the Publisher. This is now fixed.
* hide Publisher and Subscriber's getStats() The methods are not yet implemented, so there is no reason to expose it. Another way out is to delete it for the time being. * delete the methods
* fix improper use of iteration The previous implementation modifies a HashMap while iterating through its key set, causeing unpredictable iteration behavior. This might be the reason our tests intermittently deadlock. The new implementation uses a PriorityQueue. The time complexity is O(M * log N), where M is the number of expirations before the cut over time and N is the total number of expirations. I am not sure how this compares to O(N) intended in the previous implementation. If required, O(N) is also possible using an ArrayList. Unfortunately, a new failure has emerged. Instead of deadlocking, testModifyAckDeadline intermittently fails. Maybe - I have fixed the old bug and created a new one, - I have fixed the old bug that was masking another one, - The deadlock wasn't caused by the iteration. Now the tests just fail before they could deadlock, or some combination thereof. The incorrect iteration should be fixed regardless. * bug fix don't process the same job twice record next expiration after the loop
This is a follow up to #1552 . The linked PR has a race condition: If the ExtensionJobs is added to the queue after the alarm is set up, it is possible that the alarm would fire before the jobs are added. Fixing this is easy, just set up the alarm after. However, doing this consistently deadlocks the tests. Why? The tests uses a fake executor. In tests, time does not flow naturally, but is forced to increment, usually by the executor's `advanceTime` method. There is a race between the test thread advancing the time and the mock server thread inserting more tasks to the fake executor. If the tasks get inserted first, all is well. Otherwise, `advanceTime` doesn't see the tasks, and they do not get executed. The fix is to check the "current time" every time a task is inserted. If the task is inserted "in the past", we run the task immediately. Doing this still deadlocks the tests. Why? The fake executor needs to lock the task queue when registering a task. If the task is inserted in the past, it also needs to run the task. Running the task might involve sending a requst to the mock server. A GRPC thread on the mock server might handle the request by adding more tasks to the executor. The executor's queue is locked by the first thread, resulting in a deadlock. The fix is to lock the queue just long enough to retrieve a task, then execute the task without the lock.
* fix race in TestReceiver If TestReceiver is set so that messages must be explicitly acked, test code acks a message by calling replyNextOutstandingMessage. There is a race between calling the function and the message being delivered. Previously, if the function gets called before the message is delivered, the test fails since polling an empty deque returns a null pointer. This commit makes us wait until a message becomes available instead.
* fix race in mock publisher FakePublisherServiceImpl::publish has a race. If the call to publish happen before a response can be placed, the server will try to respond with a null object. The fix is to either - always set the response before calling - make publish wait for the response For propriety, this commit does both. This fix reveals another flake. Publisher uses exponential backoff with jitter. The jitter randomly picks a number between 0 and a maximum. If we pick low values too many times, it will retry too often and the server will run out of canned transient errors to respond back with. The test still passed since it expected any Throwable. This commit fixed the test to expect FakeException, and use a fake "random" number generator to ensure we don't run out of return values. Retrying can still causes random test failures, independently of above changes. If a request fails due to DEADLINE_EXCEEDED, the future is completed with a corresponding error. However, the last RPC might not have been successfully cancelled. When a new test starts, it gives canned response to the server. The server might use some of these responses to respond to RPCs of previous tests. Consequently, a misbehaving test can fail every test that comes after it. This commit changes the test setup code so that it creates a new fake server for every test to avoid this problem.
* return Name objects instead of plain String
* use ChannelProvider instead of ChannelBuilder * document that ChannelProvider should normally create new channels
* fix deadlock in testStreamAckDeadlineUpdate Fixes #1577. The detailed description of the bug is available in the issue. This commit fixes the bug by ensuring that the Future received by MessageReceiver cannot be completed before a callback is added to it.
This bug manifests as testBundleAcks deadlocking. The test 1. publishes a series of messages to a mock server, 2. waits to receive them back, 3. and acknowledges them. For performance reasons, the client does not immediately send the acknowledgement request to the server. Instead, it sends acks in batch every 100ms. Using a fake clock, the test advances the time by 100ms, sending the ack-batch, then verify that the mock server receives the acks. The bug is in step 2. The test thread waits by waiting on a CountDownLatch, which is counted down by GRPC thread calling receiveMessage(). However, the method decrements the latch before acking the message. On a bad day, the test thread can wake up, advance the clock, and send the ack-batch before the GRPC thread could add the message to the batch. The test then waits for the server to receive an ack it never sent, deadlocking the test. The fix is for receiveMessage() to ack the message before decrementing the counter.
* Update version to 0.8.1-SNAPSHOT (#1467) Also, update versions in README to 0.8.0 * Add link to Maven Central for maven-central badge. (#1468) Used to link to the image, which wasn't super useful. * fix more races in pubsub tests Previously BlockingProcessStreamReader has a terminate() method, used to tell the Reader to stop reading from the emulator process. This causes an inter-process race. If the Reader stops before reading emulator's output, the emulator process will hang as it tries to write to stdout/stderr as there's no one to read from the other side of the pipe. Since there is no way to safely stop the Reader, this commit deletes the method and its associated test. Additionally, the timeout for LocalSystemTest is increased to 3 minutes, since the emulator, somehow, consistently takes just longer than a minute to shut down. * Regenerating SPI layer (#1501) * Converting Error Reporting and Monitoring to use resource name types * Removing formatX/parseX methods from pubsub, converting usage of the same to resource name types * New methods in Logging and PubSub * Updating grpc dependency to 1.0.3 (#1504) * Release 0.8.1 (#1512) * Fix code snippet (wrong method name) in README.md Original code snippet in _"Querying data"_ section: `..while (!queryResponse.jobComplete()) {..` This results in a compile error: _"Cannot resolve method jobComplete()"_ The correct method is `jobCompleted()` * Updating version in README files. [ci skip] * Update version to 0.8.2-alpha-SNAPSHOT * Allow path in URIs passed to newFileSystem (#1470) * This makes it easier for users who start with a URI describing a full path to get a FileSystem that can work with that path, since they no longer have to needlessly remove the path from the URI. Note that Oracle's description of newFileSystem [1] puts no restriction on the passed URI. [1] https://docs.oracle.com/javase/7/docs/api/java/nio/file/FileSystems.html#newFileSystem(java.net.URI,%20java.util.Map) * Preventing logging re-entrance at FINE level (#1523) * Preventing logging re-entrance at FINE level Also: * Reducing the scope of synchronized blocks * Removing logger exclusions except for Http2FrameLogger * Add a PathMatcher for CloudStorageFileSystem (#1469) Add a test, as well. We reuse the default PathMatcher since it does a fine job of globbing files. * Set timestamp from LogRecord (#1533) * Initialize the default MonitoredResource from a GAE environment (#1535) * BigQuery: Add support to FormatOptions for AVRO #1441 Added new constant in FormatOptions, and a corresponding factory method. Updated test cases. Confirmed that AVRO does not require special treatment (like CSV does), so no additional changes are required. * Reverting changed commited by mistake before review. * BigQuery: Add support to FormatOptions for AVRO #1441 Added new constant in FormatOptions and a corresponding factory method. Updated test cases. Confirmed that AVRO does not require special treatment (like CSV does), so no additional changes are required. * use RpcFuture and remove old BundlingSettings (#1572) * use RpcFuture and remove old BundlingSettings * Release 0.8.2 Note: This version was accidentally released to Sonatype because of experimenting with deployment commands, combined with the fact that autoReleaseAfterClose is set to true. Since releases can't be taken back, we might as well own up to the release and push the code forward. * Updating version in README files. [ci skip] * Fixing javadoc error in GaeFlexLoggingEnhancer (#1582) * get tests to compile and pass
This is the same race as #1495 but in sendStreamResponse rather than sendError.
Take care of any items I mentioned that are trivial to fix, and file issues for ones that are not trivial & fix them after merging. |
google-cloud-pubsub/pom.xml
Outdated
@@ -54,6 +54,36 @@ | |||
<version>${grpc.version}</version> | |||
</dependency> | |||
<dependency> | |||
<groupId>com.google.guava</groupId> | |||
<artifactId>guava</artifactId> | |||
<version>20.0</version> |
google-cloud-pubsub/pom.xml
Outdated
google-cloud-pubsub/pom.xml
Outdated
google-cloud-pubsub/pom.xml
Outdated
/** | ||
* A barrier kind of object that helps to keep track and synchronously wait on pending messages. | ||
*/ | ||
class MessagesWaiter { |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@garrettjonesgoogle PTAL |
LGTM |
# for free
to join this conversation on GitHub.
Already have an account?
# to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.