More accurately record time spent consuming video media #261
Comments
Thanks for taking the time to write all that, greatly appreciated!
I agree, and there is actually unmerged code that solves the issue for web-based media players through the use of the
I'm personally pretty happy with the above solution, as it creates minimal complexity and works for what I suspect are the most common ways people consume video on their computers today (YouTube, Netflix, other web-based players).
I'd love to give more thorough feedback on the options you mentioned, especially tagging, but I'm really busy with exams this week so it'll have to wait. In the meantime, check out the discussion in #95 |
This is an idea aimed at video-playing apps, which account for a big part of media consumption. Make a separate watcher for media. The watcher could be either just for PC media software or for both (it would get the audible property from the web watchers). Have a whitelist of apps and check whether they're on the screen. To count time in the media watcher, you just count the time the app is on display. We have the same downside that @douglasg14b mentioned, i.e. what if the user pauses the app and leaves while the app is still on the screen?
The solution: check whether the computer is asleep. On Linux, Mac and Windows alike, the computer will usually not go to sleep while media is playing (this would work for video; not sure about purely audio). One thing to investigate further is whether the media software needs to be in full screen and playing for the computer not to go to sleep. Therefore, if there is a media app on the screen and the computer is not asleep, we would log that as playing time for the app. The following false positives would occur:
|
Detecting active audio alongside a list of sites/apps would bring the accuracy up to a very acceptable level in my opinion. Either of them by themselves would be too riddled with false positives to be too terribly useful. There isn't much need to go fancier than that imho. This is something I went into in the initial post.
Letting users create their own pattern matching for sites will also bring up the accuracy as it lets them add sites/applications to the list of video apps/sites. |
I agree that detecting audio would be the most accurate way of doing it. As you've stated, it is a complex solution. The solution I offered was meant as a less complex alternative (I think it's fair to say it would be easier to implement with the existing code, and might have a lot fewer edge cases than if we go into monitoring hardware), but at the cost of some accuracy. It ultimately depends on the amount of effort that will be put into the feature. If monitoring hardware to detect audio successfully would take too many man-hours to implement for the time being, I think the solution I suggested is a feasible proposal. I disagree that it would be so riddled with false positives as to not be useful. |
I believe that relying on standby is an incorrect assumption for users of this library. How many people's devices that are not laptops go into standby within a few minutes of being idle? Even plugged-in laptops default to 30-60 minutes on Windows 10 in balanced mode. What about high performance mode? On desktops? You're looking at 30 minutes to 4 hours IF standby is even enabled and their devices don't just turn off the screens and stay on. Never mind most Linux users, who from what I've seen probably don't use standby at all, as it's usually not on by default for most desktop installs of common flavors.
That's a lot of invalid data. If you're watching a movie and you step away to do something (bathroom, cleaning, walking the dog, cooking, making coffee, etc.), are most people actually gone long enough for their device to go into standby (60 minutes)? That would also rely VERY heavily on the end user's setup, which can vary wildly, especially if their power configuration is not left at the defaults. Assumptions about user device configuration shouldn't be made unless there is data to back them up. That is why I believe it will be more inaccurate, and potentially worse than just not recording it at all, as it currently does.
Thankfully this is a FOSS project, so man-hours aren't as much of a concern as they would be for an in-company product with expenses and wages to worry about. It's still relevant, but at least in my projects, I don't consider time to implement as a deciding factor for features or compatibility unless a solid and usable drop-in is available. |
Has anyone investigated integrating with media players through the same mechanisms as last.fm/audio scrobblers? |
Another idea would be to take a screenshot and if the screen has changed, consider it active (not afk). |
@dynamiclover We have aw-watcher-spotify as an experiment https://github.com/activitywatch/aw-watcher-spotify
@jtrakk Two issues with this
|
I think that actually might be a viable solution, the issues you mentioned are solvable. I wouldn't take it off the table just yet. It's also very simple, and doesn't have a lot of complexities compared to monitoring audio. You can probably even use something like OpenCV for this, which has a lot of utilities that make this even simpler like absdiff. Down-scale the image, which actually does two things
Perform cheap math to check the delta from one image to another
Don't monitor in real time.
Also refer to https://stackoverflow.com/questions/4196453/simple-and-fast-method-to-compare-images-for-similarity |
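The downscale-then-diff idea above can be sketched without any dependencies. This is only a minimal illustration, not ActivityWatch code: frames are assumed to be grayscale screenshots already captured as nested lists of 0-255 values, and `downscale`, `mean_abs_diff`, `screen_changed`, the scale factor, and the threshold are hypothetical names and values chosen here. With OpenCV, `cv2.resize` and `cv2.absdiff` would take the place of the two helper functions.

```python
from typing import List

Frame = List[List[int]]  # grayscale image: rows of 0-255 pixel values


def downscale(frame: Frame, factor: int) -> Frame:
    """Shrink a frame by averaging each factor x factor block of pixels."""
    rows = len(frame) // factor
    cols = len(frame[0]) // factor
    small = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                frame[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def screen_changed(prev: Frame, curr: Frame, factor: int = 4,
                   threshold: float = 2.0) -> bool:
    """Report whether two screenshots differ enough to count as activity."""
    return mean_abs_diff(downscale(prev, factor), downscale(curr, factor)) > threshold
```

Downscaling first keeps the comparison cheap and also smooths out tiny changes (a blinking cursor, a clock tick) that should probably not count as the screen changing; the threshold would need tuning against real screenshots.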
@douglasg14b I'd gladly help get it working with aw-server and the web UI if you want to make such a watcher for ActivityWatch. We love to help anyone who wants to collect more data with ActivityWatch and make it possible for them to analyze it. You can even write the watcher in C# if that's the language you prefer; we already have one watcher written in it that you can take some inspiration from (https://github.com/LaggAt/ActivityWatchVS).
However, I don't think this is something we want to ship with ActivityWatch by default, and definitely not have turned on by default, because:
I personally don't want to spend time on this because I find there to be more important things to fix currently. |
This seems doable with Pillow and pyscreenshot. Something like this might work, perhaps as a third-party watcher package.
import time

import pyscreenshot
import requests

BUCKET_URL = "http://localhost:5600/api/0/buckets/screenshot-rgb"
INTERVAL = 10

requests.post(BUCKET_URL)
while True:
    # Take a screenshot.
    im = pyscreenshot.grab(childprocess=False)
    # Get average value for each RGB channel.
    rgb = im.resize((1, 1)).getpixel((0, 0))
    # Post the rgb values.
    requests.post(BUCKET_URL + "/heartbeat", json={"rgb": list(rgb)})
    # Wait a few seconds before repeating.
    time.sleep(INTERVAL)
|
@jtrakk Nice start, a few suggestions:
|
I would like to mention the power management tool |
What about doing it inside of the extensions? Proposal:
Advantages:
Disadvantages:
I believe that since the majority of media is consumed online, the advantages outweigh the disadvantages.
EDIT: further problem
EDIT 2: A potential problem here is that this would not detect content in iframes. A comparatively small (but existent) amount of media is delivered through iframes. One example is Reuters (open an article, and it should pop up an iframe with an embedded video). Another example is embedded YouTube videos, which also use iframes. Maybe there are workarounds for this. The only one I found so far is checking for the 'autoplay' property, which, if set, indicates video content. However, this is not foolproof.
On further analysis, it seems like the 'audible' property is indeed the better choice: given that it is the active tab and audio is playing, it should indicate the user is watching some content |
I'm currently going through my AW database reviewing all events tagged as
There are a couple of false positives:
Since they are purely audio, it is very likely (more often than not, I would say) that a user puts on some music/podcast/radio etc. and then works away from their computer: typing notes, cleaning, etc. So I think we could have a whitelist of these websites where we consider content as 'afk' even if they have their audio property set to true. |
Found a way to do this directly with sound drivers using a Python library called SoundCard. This works with any type of application, not just web browsers.
Windows and Linux
Tested successfully on both Linux (relies on PulseAudio, so should work on all distributions by default) and Windows (relies on WASAPI, works on Windows 7+).
#!/usr/bin/env python
import soundcard as sc
import numpy as np


def getMic():
    '''Get a microphone from a speaker, not the actual microphone'''
    mic = None
    # Loopback "microphones" record what is being played on a speaker.
    mics = sc.all_microphones(include_loopback=True)
    for a_mic in mics:
        if a_mic.isloopback:
            mic = a_mic
            break
    return mic


def checkAudio(mic):
    '''Return True if any non-silent sample is captured within one second'''
    isAudio = False
    if mic is not None:
        # record 1 second
        data = mic.record(samplerate=48000, numframes=48000)
        isAudio = bool(np.any(data != 0))
    return isAudio


mic = getMic()
print(checkAudio(mic))
Mac
This will not work by default on macOS because it does not provide loopback functionality.
I don't have a Mac to test this, so somebody should confirm whether this works. |
@nicolae-stroncea I'll try this on my Macbook and report back shortly. |
@jmealo Not sure if you already found it, but this tutorial seemed useful to me. It helps avoid some potential pitfalls of the setup, specifically that if you don't set multi-output, your Mac won't play any sound at all since all of it will be routed only to SoundFlower. It also explains how to select SoundFlower as an input device, which is what we need |
I'll still test Soundflower, but, I found this: When I play a YouTube video in Chrome:
When I paused the video I fired up Zoom, with no meeting there was no output, upon starting a new meeting:
As far as false positives go: assuming a browser extension, you can differentiate between listening to music/watching a video. If you poll this at regular intervals, you don't have to worry about notifications much. It seems like video conferencing will prevent the display from sleeping. I can test with something that uses WebRTC and verify. |
I'm providing the output of some While playing a Youtube video in Chrome:
While in a Zoom meeting (it looks like the developers forgot to provide the correct value for the activity):
|
Also found this command, which seems to draw inspiration from same source : It doesn't have the same level of detail, but can give a quick, cheap check if audio is playing |
@nicolae-stroncea: you can do all sorts of activity tracking beyond what you set out to do on OSX with I wasn't able to get your Python to run, is it Python 2? I think it's a dead-end (but good idea! especially without having access to the hardware) given what I'm able to do by tailing the power telemetry from OSX. Using |
@jmealo that's pretty neat! I imagine there's a lot of nice aw-watcher possibilities lying in there. The script is Python3, but it would need some customizing for Mac to get it working with
I agree that since I looked for a similar command that could be useful on Linux, and found:
There are a couple of weird quirks that I didn't figure out about this. If I mute an application (but allow it to run), it will still show up with Wasn't able to find any similar command for Windows that we would be able to trigger directly from Python, but I'm not too familiar with developing on the platform. Worst case, soundcard could still be used for the cases where a reliable low-overhead platform-dependent command is not found. |
For what it's worth:
What a great find! I was looking to see if
It looks like |
It looks like the output of
Here's the documentation that I found for the command so far: |
I just tested on Windows: No video/audio playing:
Video playing:
Audio playing:
|
@jmealo nice find! I just tested it, and it works well. Unfortunately, it requires administrative privileges. I had to run powershell as an administrator to get it to work. Here's the output I got when playing a youtube video for it:
|
I'm sure I'm not the first to come up with this, but what about doing away with complex heuristics and just having a switch "Consider time spent in fullscreen applications as not-afk"?
|
Assuming `Full screen -> active` is a heuristic too, and I can see many
ways for it to fail (you watch a movie/play a game, but pause/leave the
device for a while to take a break).
I'm not even sure if it's feasible to detect if an application is full
screen in a reliable and cross-platform manner.
|
Yup, but it's a simple one, and as I've said above, I can't really think of an instance where it would fail for me (or at least, fail worse than what we've got now), and if it does, I think I would be fine with the result.
That would result in a couple of minutes of wrongly tagged time, as opposed to literal hours. Additionally, pausing an online game does vaguely match what I would consider "not-AFK" time: a fullscreen app running directly implies that attention is meant to be paid to the computer. I understand this may be a point of contention, but that's why I'm appealing to the simplicity of the heuristic. It's easy to read, since it doesn't rely on complex OS machinery relating to audio, or combinations of window titles and somewhat arbitrary (1hr30) idleness thresholds, etc. If there's an artifact, it's likely going to be small(er) and easily recognized. And if it's an issue, well, it's a single switch after all. It can even be false by default.
Well, this thread seems to be considering metrics which so far do not seem feasible to detect even on any single operating system! :) So I didn't think this would be that much of a hurdle. The bottom line is: this is a somewhat major issue for activity tracking software, and has been unsolved for more than 2 years. I do appreciate its difficulty and complexity, and I don't claim that fullscreen-tracking is the final solution, but! I think it's a feasible partial solution, and if it were present, I would have simply toggled it on and gone on with my day. "Don't let perfect be the enemy of the good", yadda yadda. |
I don't think it's any simpler than most of the other things already suggested (but it's a good addition still!).
Everyone's usage is different. I often leave my computer with a game running for hours. I also sometimes fall asleep to a video playing. Personally, I'm not inclined to implement & maintain a feature I won't have any use of myself.
It does! But those hurdles are exactly why it hasn't been implemented. Although I disagree that the other proposed solutions aren't feasible to detect on a single OS, and I think it's about as feasible to check if audio is playing vs if a window is fullscreen (but hard to speculate without seeing example code for the latter).
100%. However, the 'good' solution that's the most likely to get implemented anytime soon is using the |
I just merged ActivityWatch/aw-webui#262 which implements the "audible-as-active" feature. It makes it so that if your browser is the active app, and the active browser tab is audible (playing sound), then it will not count that time as AFK (and therefore make it show on your Activity view). It requires that you're running the web watcher for your browser. This vastly improves the situation when you watch a video in your web browser, but it is not a complete solution, so I'll leave the issue open for now. |
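The rule described above (browser is the active app, and the active tab is audible, so the time is not counted as AFK) can be expressed as interval arithmetic. The following is only a sketch of that logic under simplifying assumptions, not the code merged in aw-webui: periods are plain `(start, end)` tuples in seconds, each input list is sorted and non-overlapping, and `intersect`, `subtract`, and `afk_after_audible` are hypothetical helper names.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlaps between two sorted lists of non-overlapping intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Remove every part of the intervals in `a` that is covered by `b`."""
    result = []
    for start, end in a:
        cursor = start
        for b_start, b_end in b:
            if b_end <= cursor or b_start >= end:
                continue  # no overlap with the remaining part of this interval
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
        if cursor < end:
            result.append((cursor, end))
    return result


def afk_after_audible(afk: List[Interval], browser_active: List[Interval],
                      tab_audible: List[Interval]) -> List[Interval]:
    """AFK time that remains once audible time in the active browser is excluded."""
    watching = intersect(browser_active, tab_audible)
    return subtract(afk, watching)
```

For example, with AFK from 100 s to 1000 s, the browser focused from 0 s to 600 s, and the tab audible during 200-500 s and 550-900 s, only 200-500 s and 550-600 s are treated as watching; audio that keeps playing after the browser loses focus stays AFK.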
Will this be available in the nightly build soon? It seems that the |
@archiif I just updated aw-server and aw-webui in the main activitywatch repo, should hopefully work. |
@archiif I have fixed the integrations tests now too so the nightly builds are now working again. |
Is that optional? I use the Chrome plugin, and I have music (youtube video though) running pretty much all the time no matter if I'm in front of it or not. |
@luckydonald Yes it's optional, you can turn it off in the settings. |
what about zoom or conference call apps. I am afraid I find zoom being AFKed most of the time. And it counts actually towards my productivity hours 😞 |
I have the same issue - very happy to contribute if someone can help shape the solution. What about having an exclusion regexp for the afk watcher? |
Just wanted to say I had the same issue with conferencing apps like Zoom, Teams or Skype. Is there any solution to that? Thank you. |
I've been thinking of adding a "always count as active" regex configuration rule (possibly with a max-duration, such that actually long AFK periods don't get included by accident). This would be easy to implement and work as a decent workaround, I think. Would also help in some situations I've noticed where gamepads aren't detected. Edit: basically what was suggested by @placeybordeaux #261 (comment) |
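A rough sketch of how such an "always count as active" rule with a max duration might behave is shown below. It is illustrative only: the `Event` shape, the `apply_always_active` name, and the one-hour default cap are assumptions made for this example, not ActivityWatch's data model or the implementation that was later merged.

```python
import re
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Event:
    app: str
    title: str
    duration: float  # seconds
    afk: bool


def apply_always_active(events: List[Event], pattern: str,
                        max_duration: float = 3600.0) -> List[Event]:
    """Flip AFK events to active when app or title matches the pattern.

    Events longer than max_duration stay AFK, so a meeting window left
    open overnight is not counted as hours of activity.
    """
    rule = re.compile(pattern, re.IGNORECASE)
    result = []
    for event in events:
        matches = rule.search(event.app) or rule.search(event.title)
        if event.afk and matches and event.duration <= max_duration:
            event = replace(event, afk=False)
        result.append(event)
    return result
```

With the pattern `zoom|teams|skype`, a 30-minute AFK period in Zoom would be counted as active, while a two-hour one would remain AFK because it exceeds the cap.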
That would be great, I would be very grateful if you could do that!! Thank you in any case. |
I feel like that would be a great temporary solution for now. I tried messing around with manual SQL queries to edit existing afk data, and got nowhere. If there was a switch for certain applications, not even in the trackers but in the UI (which would probably be even easier to implement and doesn't actually modify data for now), then it would be great |
@8Dion8 It has been implemented, see ActivityWatch/aw-webui#375. Just need to wait for the next release now 👀. |
Awesome! Can't wait to see how much time I actually wasted on binging shows😂 |
I'm testing the new feature in ActivityWatch/aw-webui#375 which landed in v0.12.2b1 and I am seeing failures that cause the whole Activity page to stop loading.
|
Just a few notes about mpv under Linux wayland environment. Its manual says:
So maybe we can utilize this "idle-inhibit protocol". And this protocol is tied to the idle notify protocol, which says:
So if we implement idle notify in the afk watcher, it just works! Nothing else needs to be handled. |
The Problem
One of the larger time-sinks today is video, whether through streaming services like Netflix, Amazon Prime Video, or HBO, media sites like YouTube, Twitch, or Vimeo, or downloaded or streamed media in players like VLC or MPV.
ActivityWatch, unfortunately, fails to effectively record the time spent on these activities, as it relies on mouse and keyboard input to determine activity. This means that when watching a video, ActivityWatch will mark the time as afk after a short while, even though the user is present and spending time on an activity at their device.
This was brought up in #186, which was marked as `wontfix`. I believe that this is something that CAN be solved, and should be seriously considered given the amount of time that can be spent consuming media.
Afk time can be disabled, but this then pollutes the data. Users may go afk for a variety of times in a variety of applications or websites throughout the day, which could pollute the afk time to the point where video-specific time may no longer be useful.
Possible solutions
Note: Not all problems/disadvantages/pitfalls are meant to be solvable. I am including them for devil's advocate's sake and to foster a more robust discussion.
Application/Site Tagging
Compile a list of common media applications and websites. When the user goes `afk` on one of these sites or applications, mark the time as non-afk. This isn't technically tagging, but it could be set up to work in a tag-like way, which would make this a very extendable solution for more than just videos.
Advantages
Disadvantages/Pitfalls
Enhancements
These are here to try and solve for some of the problems presented under disadvantages.
Monitoring Hardware
Monitor audio output to see if a video is playing.
Advantages
Disadvantages/Pitfalls
Enhancements
General Enhancements
These are enhancements that could apply to any solution, to increase accuracy and to enable the user to correct mismatches and errors.
User-Defined Lists & Filters
Let the user create/modify the list of sites/applications, and/or the patterns used to match them.
Tagging and Pattern Matching
This goes above and beyond, but would really turn this into a much more powerful tool.
Instead of just solving the video problem, create an extendable solution that encompasses the general problem category that the video problem is part of. This would take the form of tagging: being able to automatically tag domains and applications with predefined or user-defined tags. This can be facilitated with pattern-matching lists and, depending on the data's schema/format, could be applied to already existing data, greatly enhancing its utility.
As an example, time in VLC, YouTube, or Netflix could be tagged as `video`, which gives users the power to filter this time separately, combine it in reports, or more easily correct collection errors.
This of course could be set up to be user-manageable, with the points listed in the previous section.
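To make the tagging proposal concrete, here is a minimal sketch of pattern-based tagging. The tag names, the regular expressions, and the `tags_for` helper are all hypothetical examples; a real version would load user-defined lists rather than a hard-coded dictionary.

```python
import re
from typing import Dict, List

# Hypothetical default patterns; users would be able to edit or extend these.
TAG_PATTERNS: Dict[str, List[str]] = {
    "video": [r"\bvlc\b", r"\bmpv\b", r"youtube\.com", r"netflix\.com"],
    "music": [r"\bspotify\b"],
}


def tags_for(source: str, patterns: Dict[str, List[str]] = TAG_PATTERNS) -> List[str]:
    """Return the sorted tags whose patterns match an app name, title, or URL."""
    return sorted(
        tag
        for tag, regexes in patterns.items()
        if any(re.search(rx, source, re.IGNORECASE) for rx in regexes)
    )
```

Because the function only reads an app name, window title, or URL, the same patterns could in principle be run over already-recorded events as well as new ones.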
Conclusion
I believe the ability to capture time spent consuming video-based media will have an impact on the future usability of this project as these sorts of services continue to expand and bring in more and more people. Solving it would not only address video tracking, but could also greatly enhance the utility and power of this application.
What are your thoughts? (Please don't automatically mark this as closed; this took some time and effort to create.)