Gather POST request on HTTPS traffic?

Roeeklinger60
Hello everyone,

I work at a digital agency that has quite a few machines managing Instagram accounts. They all run on the same LAN, and we use Squid as a proxy to log and analyze usage statistics and to make sure the machines are only used for Instagram.

We had an idea to use Squid to capture users' POST data at the proxy level (likes, follows, comments, etc.) so that we can log and analyze everything in one central place, and even send our clients a monthly report of all the actions their accounts made (who they followed, what they liked, etc.).

I can easily see the requests that I want to capture in the "network" tab in Chrome, but the problem is that Instagram uses HTTPS, so I cannot capture this data.


Is there any way for me to log this data via Squid using the POST data or any other way?


Note: We are aware of the legal issues, all machines connected to the network are company property, and all the accounts are client accounts that allow us to gather statistics. No personal account data will be gathered.


_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Gather POST request on HTTPS traffic?

Amos Jeffries
Administrator
On 17/11/20 12:14 pm, roee klinger wrote:

> Hello everyone,
>
> I work at a digital agency that has quite a few machines that are
> managing some Instagram accounts, they are all running in the same LAN
> and we are using Squid as a proxy to log and analyze some usage
> statistics and to make sure the machines are only used for Instagram.
>
> We had an idea to use Squid to capture the POST data of users on the
> proxy level, for example, likes, follows, comments, etc so we can log
> and analyze everything in a convenient central way, so we can analyze it
> and even send out clients a monthly report of all the actions their
> accounts made (who they followed, what they liked, etc).
>
> I can easily see the requests that I want to capture inside the
> "network" tab in Chrome but the problem is that Instagram uses HTTPS, so
> I can't seem to be able to capture this data.
>
>
> Is there any way for me to log this data via Squid using the POST data
> or any other way?
>

Access to HTTPS transactions for a domain you do not own requires the
SSL-Bump feature to decrypt ("bump") the TLS layer.
  see <https://wiki.squid-cache.org/Features/SslPeekAndSplice>.

You could use cache.log with "debug_options ALL,1 11,2" configured to
log the transactions. However, an ICAP service or eCAP module that does
both the recording and the analysis for you is probably better.
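
As a rough illustration of the cache.log route, here is a minimal Python sketch that pulls POST request lines out of such a log. It assumes each logged request is introduced by a line containing "HTTP Client REQUEST:" and that the header block sits between two lines of dashes; that layout is an assumption and should be checked against your own cache.log. The function names are hypothetical, and the sketch only looks at the request line and headers; it does not attempt to read POST bodies.

```python
import re
from typing import Dict, Iterable, Iterator, List

# Assumed marker and delimiter; verify against your own cache.log.
REQUEST_MARKER = "HTTP Client REQUEST:"
DASHES = re.compile(r"^-{5,}\s*$")


def iter_request_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the lines found between dash delimiters after each marker."""
    state = "idle"
    block: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if state == "idle":
            if REQUEST_MARKER in line:
                state = "expect_open"
        elif state == "expect_open":
            if DASHES.match(line):
                state, block = "in_block", []
            elif REQUEST_MARKER not in line:
                state = "idle"
        else:  # in_block
            if DASHES.match(line):
                yield block
                state = "idle"
            else:
                block.append(line)


def post_requests(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return method, target and Host header for every POST block."""
    found = []
    for block in iter_request_blocks(lines):
        non_empty = [l for l in block if l.strip()]
        if not non_empty:
            continue
        parts = non_empty[0].split()
        if len(parts) < 2 or parts[0] != "POST":
            continue
        headers = {}
        for header_line in non_empty[1:]:
            name, sep, value = header_line.partition(":")
            if sep:
                headers[name.strip().lower()] = value.strip()
        found.append(
            {"method": parts[0], "target": parts[1], "host": headers.get("host", "")}
        )
    return found
```

It could be run as `post_requests(open("cache.log"))`, with the path adjusted to wherever your Squid writes cache.log. For anything beyond a quick test, the ICAP/eCAP approach mentioned above is likely the better fit.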


>
> Note: We are aware of the legal issues, all machines connected to the
> network are company property, and all the accounts are client accounts
> that allow us to gather statistics. No personal account data will be
> gathered.


Please be aware:
   That statement conflicts with the stated purpose(s) of your plan.

Personal data *will* be part of the messages you are decrypting and
recording for analysis. Further, to perform targeted reports such as
described you must also associate the data with accounts somehow.


Amos

Re: Gather POST request on HTTPS traffic?

Roeeklinger60
Hey Amos,

Thanks for your response, I will try to implement this today and check if I can get the data I am looking for.

I do however have a few questions regarding this approach:
1. If I understand the documentation correctly, the server receives a request identical to the one the client sent, meaning the server should not detect anything unusual? The last thing I want is for Instagram to detect something unusual and ban our clients' Instagram accounts.
2. You said I will need to figure out a way to identify accounts. In Chrome, the requests contain the info for both the account performing the action and the account receiving the action. Should I see the same in these requests?
3. By "personal" data we are referring to data generated by our clients' accounts. Our clients are paying us and are willing for us to collect it to improve our service. Of course it will also contain data on the accounts they perform the actions on, but nothing that is not already visible in the Instagram app. Is there anything else I should be aware of that might be a privacy issue?
4. While this is great for my use case, is this something I should be aware of when using outside proxies on our machines? Can any proxy service simply decrypt and log our personal data? It seems like a security vulnerability I should be aware of.

Thanks again.


Re: Gather POST request on HTTPS traffic?

Amos Jeffries
Administrator
On 18/11/20 1:41 am, roee klinger wrote:

> Hey Amos,
>
> Thanks for your response, I will try to implement this today and check
> if I can get the data I am looking for.
>
> I do however have a few questions regarding this approach:
> 1. If I understand the docu currently, then the server is getting a
> response which is identical to the client, meaning the server should not
> detect anything unusual? The last thing I want is for Instagram to
> detect something unusual and ban our clients Instagram accounts.

That depends on what you configure. Interception is always
detectable, though most services have limited detection (if they care at
all).


> 2. You said I will need to figure out a way to identify accounts, in
> Chrome the requests contain the info for both the accounts performing
> the action and the account receiving the action, should I see the same
> in these requests?

Yes. That is what I mean by personal data *will* be gathered.


> 3. By “personal” data we are referring to data generated by our clients
> accounts, which are paying and willing for us to collect it to improve
> our service, of course it will also contain data on the account which
> they are performing the actions on, but this is not something that is
> not visible on the Instagram app, is there anything else I should be
> aware of that might be a privacy issue?


That definition confirms the false nature of "No personal account data
will be gathered." - having permission to gather does not negate the
existence of gathering.

Just make sure you have a real lawyer's opinion and advice on the
details of your situation.


> 4. While this is great for my use case, is this something I should be
> aware of when using outside proxies on our machine? Can any proxy
> service simply decrypt and log our personal data? Seems like a security
>   vulnerability I should be aware of.
>

You will notice when configuring SSL-Bump that you must install the
signing CA certificate used by your proxy into the client software.
Without that CA trust you cannot bump.

The same holds for any intermediary software: it can bump only if the
client trusts its CA.
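
To make the CA trust point concrete, here is a minimal sketch using Python's standard ssl module as a stand-in for "client software". By default a client context requires a valid certificate chain and a matching hostname, so certificates minted by a bumping proxy are rejected unless the proxy's signing CA has been added to the client's trust store. The function name and the proxy_ca_file parameter are hypothetical and only for illustration.

```python
import ssl
from typing import Optional


def build_client_context(proxy_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS client context, optionally trusting one extra CA file."""
    # The default context verifies the server certificate chain against
    # the system trust store and checks the hostname.
    ctx = ssl.create_default_context()
    if proxy_ca_file is not None:
        # Explicit opt-in: only after this step would certificates signed
        # by the proxy's CA pass verification.
        ctx.load_verify_locations(cafile=proxy_ca_file)
    return ctx
```

Without the explicit load_verify_locations step, a handshake through a bumping proxy would be expected to fail certificate verification, which is why an outside proxy cannot silently decrypt traffic from a client that has not been configured to trust it.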


Amos

Re: Gather POST request on HTTPS traffic?

Roeeklinger60
Thanks for the reply Amos,

> You will notice when configuring SSL-Bump that you must install signing
> CA certificates used by your proxy into the clients software.

I understand, this is something I apparently missed.

I sometimes use scraping proxies that detect whether a scrape succeeded and, if it did not, rerun the request
from a different proxy. They even go as far as automatically solving captchas for the client or changing
content on the page. I am pretty new to this, but these features seem impossible to me on HTTPS connections
without access to the client's machines.

Is there something I am missing or misunderstanding?
I cannot seem to find a good place to start reading about this.

Thanks.





Re: Gather POST request on HTTPS traffic?

Eliezer Croitoru-3
In reply to this post by Amos Jeffries
Hey Roee,

From what I remember, the best long-term solution would be to use an eCAP module.
You can use debug_options and it will work well.
The main issue with it is disk I/O.
If you have beefy hardware with SSD and RAM on the machine, then debug_options might be good enough for you.

But the most important thing is to test and verify that it works in your specific environment.

All the best,
Eliezer

----
Eliezer Croitoru
Tech Support
Mobile: +972-5-28704261
Email: [hidden email]
