Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[#200] Respond HTTP 204 when no more changes are available #203

Merged
merged 15 commits into from
Jan 30, 2020

Conversation

iperdomo
Copy link
Collaborator

The proxy server (nginx) will cache this response for 1h. Useful for eager clients re-requesting the same thing over and over again. The value of 1h is arbitrary and we can adjust it.

A client can bypass the proxy by using a request header Pragma: no-cache

The proxy server (nginx) will cache this response for 1h. Useful for
_eager_ clients re-requesting the same thing over and over again.
The value of 1h is arbitrary and we can adjust it.

A client can bypass the proxy by using a request header `Pragma: no-cache`
@iperdomo iperdomo requested a review from dlebrero January 27, 2020 11:19
@iperdomo iperdomo self-assigned this Jan 27, 2020
Copy link
Contributor

@dlebrero dlebrero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. I like the approach

proxy_cache_lock on;
proxy_cache_methods GET;
proxy_cache_key $proxy_host$request_uri$akvoemail;
proxy_cache_valid 204 60m;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more in 30 to 60 secs. "Real time"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "real time" we have a way to punch a whole and bypass the cache. My intention with the 1h cache is to avoid reaching the database and other services in a transparent way. The client is not aware that we're serving cached content.

proxy_cache_methods GET;
proxy_cache_key $proxy_host$request_uri$akvoemail;
proxy_cache_valid 204 60m;
proxy_cache_bypass $http_pragma;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment regarding $request_uri. Allow the client to "cheat". Is it going to be abused?

Assuming that the cache value is 30-60 secs, what is the benefit of using it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not going to document that you can bypass and force to hit the upstream server

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that the cache value is 30-60 secs, what is the benefit of using it?

As mentioned in the previous comment, my intention is to have a larger cache time than 30-60s (for those bad behaving clients) with a escape hatch for our services.

:next-sync-url (next-sync-url (utils/get-api-root req) alias (:unilog-id changes))}))
(if (= offset (unilog/get-cursor db-spec)) ;; end of the log
(-> (status {} 204)
(header "Cache-Control" "max-age=3600"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the Nginx adding this header? (Haven't tested it)

In any case, there is some duplication here and in the nginx code regarding the caching time. Either we make the proxy_cache_valid follow this header or we remove this header.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the documentation, nginx will honor the Cache-Control header from the upstream server. I'll double check what is the behavior when both properties differ.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the Nginx adding this header? (Haven't tested it)

nginx is not adding the header, but it does cache the HTTP 204 response. I can see the instrumentation header:

< HTTP/1.1 204 No Content
< Server: openresty
< Date: Tue, 28 Jan 2020 09:24:49 GMT
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: Accept,Accept-Encoding,Accept-Language,Authorization,Connection,DNT,Host,Origin,Referer,User-Agent
< X-Cache-Status: MISS
< 

[...]

< HTTP/1.1 204 No Content
< Server: openresty
< Date: Tue, 28 Jan 2020 09:25:06 GMT
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: Accept,Accept-Encoding,Accept-Language,Authorization,Connection,DNT,Host,Origin,Referer,User-Agent
< X-Cache-Status: HIT

In this case the server is not sending the Cache-Control but we do have the configuration proxy_cache_valid 204 <time>

(update :data-point-deleted #(map str %)))]
(response {:changes (dissoc changes :unilog-id)
:next-sync-url (next-sync-url (utils/get-api-root req) alias (:unilog-id changes))}))
(if (= offset (unilog/get-cursor db-spec)) ;; end of the log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the third JDBC query: 1 for valid, 1 for last cursor and 1 for reading the data.

Not an issue if we open the DB connection once, but if we open/close for each, maybe we should look into avoiding 2 of those 3 calls.

Being very picky here, so feel free to ignore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • the last cursor is one code path (initial=true)
  • the valid and reading (next=true), for checking valid and reading data can be transformed into one

@@ -41,20 +41,24 @@
db-spec (-> deps :unilog-db :spec (assoc :db-name db-name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is becoming a monster. Maybe time to clean up?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed 👍

@iperdomo iperdomo requested a review from dlebrero January 29, 2020 15:27
Copy link
Contributor

@dlebrero dlebrero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I would just ask for removing the “sleep” if it is not a lot of work as it always ends up making the test flaky and slower.

Added a bunch of optional suggestions. Feel free to ignore, no need to justify why.

Thanks!

(fn [request]
(let [db-name (get-db-name (:instance-id request))
db-spec (-> unilog-db :spec (assoc :db-name db-name) event-log-spec)]
(with-open [connection (jdbc/get-connection db-spec)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use the with-db-connection macro. The docs discourage using get-connection, plus I guess that using the macro means that we do not need to manually build the {:connection} map

ci/build.sh Outdated
./test.sh "http://localhost:8082/flow/" 2>&1 | grep 'X-Cache-Status: MISS'
./test.sh "http://localhost:8082/flow/" 2>&1 | grep 'X-Cache-Status: HIT'
./test.sh "http://localhost:8082/flow/" 2>&1 | grep 'X-Cache-Status: HIT'
./test.sh "http://localhost:8082/flow/" --header "Pragma: no-cache" 2>&1 | grep 'X-Cache-Status: BYPASS'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: I thought we agreed to remove the bypass as we have reduced the cache time to 60 secs, plus we can bypass it by using a random request param

ci/build.sh Outdated

# Checking nginx caching
(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Move out of build

ci/build.sh Outdated
(
cd nginx-auth0
docker-compose up -d
sleep 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change: No sleeps in tests please.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we just deleted the sleep? Why is no longer needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that you're checking an outdated code. This checks are not part of the build.sh anymore as you suggested.

Suggestion: Move out of build

(-> (response {:changes (dissoc changes :unilog-id)
:next-sync-url (next-sync-url api-root alias cursor)})
(header "Cache-Control" "no-cache"))))

(defn changes [deps {:keys [alias instance-id params] :as req}]
(let [{:keys [initial cursor next]} (spec/validate-params params-spec params)]
(if (and initial (or cursor next))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: change nested ifs for cond

@@ -26,35 +26,51 @@
;; TODO: Express parameter logic via spec?
(def params-spec (s/keys :opt-un [::initial ::next ::cursor]))

(def no-more-changes (-> (status {} 204) (header "Cache-Control" "max-age=60")))


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Remove empty line

@@ -147,3 +147,14 @@
(:form-instances-to-load events-2)))
data-point-changed (doall (data-point/by-ids ds (:data-point-changed events-2)))]
(assoc events-2 :form-instance-changed form-instances :data-point-changed data-point-changed))))


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: remove empty line 🤐

Instead of creating the connection and then creating a `{:connection conn}`
map when needed, we can use the with-db-connection macro
@iperdomo iperdomo requested a review from dlebrero January 30, 2020 09:31
@iperdomo
Copy link
Collaborator Author

@dlebrero gave his 👍 via another channel

@iperdomo iperdomo merged commit e69a16a into develop Jan 30, 2020
@iperdomo iperdomo deleted the issue/200-caching branch January 30, 2020 10:35
@dlebrero
Copy link
Contributor

dlebrero commented Jan 30, 2020 via email

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants