Solutions and cookies tracked in-memory #1

Open
opened 2023-06-12 10:48:26 -04:00 by cerealxp · 0 comments
Owner

Currently, we remember the correct solution for each cookie's CAPTCHA challenge in an in-memory dict keyed by cookie. Similarly, we track the cookies that have solved their CAPTCHA in a list.

```python
captcha_solutions = {}
captcha_solved = []
# ...
    captcha_cookie = request.cookies.get('freecaptcha_cookie')
    real_answer = captcha_solutions.get(captcha_cookie, None)
    # etc, etc
```

(from https://git.lain.church/jesusvilla/FreeCAPTCHA/src/commit/97c28170c954dfbc85f9524b91d35b050bbed31f/app.py)

We could simply do the same thing using a database instead of built-in Python data structures. My initial thought was to use SQLite. However, I'd like to keep things as easy as possible on the backend, so let's see if we can replace these in-memory stores without needing a database at all...

Proposed solution

JWTs.

Pancho, JWTs are a fantastic use case for teaching Alfredo how public key cryptography works in a practical, very realistic context (here's a great video explaining public key crypto in 6 minutes: https://www.youtube.com/watch?v=GSIDS_lvRv4).

Basically, instead of storing a cookie in the user's browser that matches a session in our database, we cryptographically sign the cookie before handing it to the user's browser. When it comes back, we know it must be legit because the signature checks out against our secret key. So we only store one thing on the server - the secret key.
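To make the signing idea concrete, here is a minimal sketch of an HS256-signed JWT built with only the standard library. It is an illustration, not the project's implementation: `SECRET_KEY`, `sign_token` and `verify_token` are hypothetical names, and in practice a maintained library such as PyJWT would be a safer choice than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical placeholder; a real deployment would load a long random key from config.
SECRET_KEY = b"change-me"


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, key: bytes = SECRET_KEY) -> str:
    """Return header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_token(token: str, key: bytes = SECRET_KEY) -> Optional[dict]:
    """Return the payload if the signature is valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None
    # The algorithm is fixed to HS256 here; the "alg" field sent by the
    # client is deliberately never consulted.
    signing_input = f"{header_b64}.{payload_b64}".encode("utf-8")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(b64url_decode(payload_b64))
```

A token produced by `sign_token({"solved": True})` round-trips through `verify_token`, while any change to the payload or the key makes verification return `None`.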

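A lot of devs dislike JWTs. Here's why:

  * http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/
  * https://www.howmanydayssinceajwtalgnonevuln.com
  * https://gist.github.com/samsch/0d1f3d3b4745d778f78b230cf6061452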

Anyway - so right now, we have two in-memory data stores:

  1. Solutions for CAPTCHAs
  2. Sessions for users who solved their CAPTCHA

Ideally, we don't want to store anything - everything should be done just with cryptography.

How to store CAPTCHA solutions with JWTs

This is the hard part, since we need to store the solution, and we can't give the solution to the user as part of the JWT. Or can we...?

Spoiler: we can. We just have to encrypt it with our public key! The initial JWT will contain the solution to the CAPTCHA, encrypted with our public key so that only the server, which holds the matching private key, can read it. When the user submits the challenge, we decrypt the solution in their JWT and compare it with the answer they've submitted!

Isn't that clever? I don't know if anyone's ever done that with a JWT before, so we won't be able to just npm install jwt-store-encrypted-secret or whatever you usually do haha (jk doggie ily bro).

Seriously though, it should be simple to roll our own. We just need a secure public/private key mechanism to encrypt the data before sending it to the user.
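As a rough sketch of that mechanism, the snippet below encrypts a solution with RSA-OAEP using the third-party `cryptography` package. That package is an assumption on my part (the issue itself only suggests PGP-style tooling), and the helper names and the key generated at import time are hypothetical. OAEP is randomized, so encrypting the same solution twice gives different ciphertexts, which matters because CAPTCHA solutions are short and guessable.

```python
import base64
import hmac
from typing import Optional

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Generated at import time for illustration only; a real server would load
# a persistent key so tokens survive restarts.
PRIVATE_KEY = rsa.generate_private_key(public_exponent=65537, key_size=2048)
PUBLIC_KEY = PRIVATE_KEY.public_key()

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def encrypt_solution(solution: str) -> str:
    """Encrypt the solution with the public key; the result goes into the JWT payload."""
    # With a 2048-bit key and OAEP-SHA256 the plaintext is limited to 190 bytes,
    # which is plenty for a CAPTCHA answer.
    ciphertext = PUBLIC_KEY.encrypt(solution.encode("utf-8"), OAEP)
    return base64.urlsafe_b64encode(ciphertext).decode("ascii")


def decrypt_solution(field: str) -> Optional[str]:
    """Decrypt with the private key; return None if the field is not a valid ciphertext."""
    try:
        ciphertext = base64.urlsafe_b64decode(field)
        return PRIVATE_KEY.decrypt(ciphertext, OAEP).decode("utf-8")
    except ValueError:
        return None


def check_answer(field: str, submitted: str) -> bool:
    """Compare the decrypted solution with the user's answer in constant time."""
    real_answer = decrypt_solution(field)
    if real_answer is None:
        return False
    return hmac.compare_digest(real_answer.encode("utf-8"), submitted.encode("utf-8"))
```

The encrypted field would still sit inside a signed JWT so the user can't swap in a ciphertext of their own. This sketch does not deal with a user replaying the same challenge token more than once.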

Yay, now we've removed that issue! That was the hard part, so we're mostly done.

JWTs for the session

If the user submits a valid solution, we give them a JWT that expires after about an hour. Easy peasy. This is the normal way people use JWTs to replace session cookies.
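However the token ends up being signed, the expiry part comes down to the standard `iat` and `exp` claims (seconds since the epoch). A minimal sketch, assuming a hypothetical `captcha_solved` claim and a one-hour lifetime:

```python
import time
from typing import Optional

SESSION_LIFETIME_SECONDS = 60 * 60  # assumption: "about an hour"


def new_session_claims(now: Optional[float] = None) -> dict:
    """Claims to put in the session JWT once the CAPTCHA is solved."""
    issued_at = int(time.time() if now is None else now)
    return {
        "captcha_solved": True,  # hypothetical claim name
        "iat": issued_at,
        "exp": issued_at + SESSION_LIFETIME_SECONDS,
    }


def session_is_valid(claims: dict, now: Optional[float] = None) -> bool:
    """Check already-verified claims: the CAPTCHA was solved and the token has not expired."""
    current = time.time() if now is None else now
    exp = claims.get("exp")
    return (
        claims.get("captcha_solved") is True
        and isinstance(exp, (int, float))
        and current < exp
    )
```

`session_is_valid` is only meaningful on claims whose signature has already been verified; the `now` parameter exists so the logic can be tested without waiting an hour.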

The only issue is that a human could manually get a valid session and then hand it to their bot, which can then spam the site using the valid JWT. The solution is to rate-limit users. We can't rate-limit them by IP, obviously (Tor onion services don't expose client IPs), but that's no problem - we can rate-limit requests per JWT session instead.
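Rate limiting per token rather than per IP could look roughly like the sliding-window sketch below. It is a stdlib stand-in for illustration, not Flask-Limiter's API; `PerTokenRateLimiter` and `token_id` are hypothetical names, and the counters do live in memory, so this is one piece of state that cryptography alone can't remove.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class PerTokenRateLimiter:
    """Allow at most max_requests per token within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # token_id -> timestamps of recent requests

    def allow(self, token_id: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        hits = self._hits[token_id]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= current - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(current)
        return True
```

Here `token_id` could be anything that uniquely identifies a session JWT, for example a hash of the token. A bot reusing a human's token then shares that token's budget instead of getting unlimited requests.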

Tips for implementing

  * JWT debugging tool: https://jwt.io/
  * PGP.py or similar for encrypting solutions in JWT payloads: https://stackoverflow.com/questions/1020320/how-to-do-pgp-in-python-generate-keys-encrypt-decrypt
  * Rate limiting per JWT session: https://flask-limiter.readthedocs.io/en/stable/ (although in practice, rate limiting is outside the scope of this issue. We should create a separate issue to address the need for per-JWT rate limiting)

cerealxp added the enhancement label 2023-06-12 10:49:03 -04:00
Reference: cerealxp/FreeCAPTCHA#1