[-] Sal@mander.xyz 13 points 2 months ago

Check in your settings whether you have disabled the visibility of bot responses. This can happen if bots replied to you and your settings are set to not see them.

72
submitted 3 months ago by Sal@mander.xyz to c/opensource@lemmy.ml

Cross-posting to the OpenSource community as I think this topic will also be of interest here.

This is an analysis of how "open" different open source AI systems are. I am also posting the two figures from the paper that summarize this information below.

ABSTRACT

The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are ‘open weight’ at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.

Figure 2: Openness of 40 text generators described as open, with OpenAI’s ChatGPT (bottom) as closed reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points. RL may refer to RLHF or other forms of fine-tuning aimed at fostering instruction-following behaviour. For the latest updates see: https://opening-up-chatgpt.github.io

Figure 3: Overview of 6 text-to-image systems described as open, with OpenAI's DALL-E as a reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points.
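To make the scoring in the captions concrete, here is a minimal sketch of how the cumulative openness sort could be computed. The point values (✓ = 1, ∼ = 0.5, ✗ = 0) come from the captions; the function names, the system names and the four-column rows are made up for illustration and are not data from the paper, which uses 14 dimensions.

```python
# Points per judgement, as stated in the figure captions.
POINTS = {"✓": 1.0, "∼": 0.5, "✗": 0.0}


def openness_score(cells):
    """Sum the points for one system's row of judgements."""
    return sum(POINTS[cell] for cell in cells)


def rank_by_openness(systems):
    """Return (name, score) pairs sorted from most to least open."""
    scored = [(name, openness_score(cells)) for name, cells in systems.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical rows with 4 dimensions each (illustrative only).
example = {
    "system-a": ["✓", "✓", "∼", "✗"],
    "system-b": ["✗", "✗", "∼", "✗"],
    "system-c": ["✓", "✓", "✓", "∼"],
}

for name, score in rank_by_openness(example):
    print(f"{name}: {score}")
```

A single summed score is only a convenience for sorting. The abstract's point is that openness is composite, so two systems with the same total can be open in very different dimensions.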

There is also a related Nature news article: Not all ‘open source’ AI models are actually open: here’s a ranking

PDF Link: https://dl.acm.org/doi/pdf/10.1145/3630106.3659005

[-] Sal@mander.xyz 34 points 7 months ago

If the timing is right, I would bring a mushroom grow bag with mushrooms sprouting.

If not... probably my Radiacode gamma spectrometer and some of my radioactive items. Maybe a clock with radium-painted dials and a piece of trinitite. I think that there are many different points of discussion that can be of interest to a broad audience (radioactivity, spectroscopy, electronics, the US labor law story of the Radium Girls, nuclear explosions, background radiation, etc.). As a bonus I can bring a UV flashlight and show the radium fluorescence. Adults love UV flashlights.

[-] Sal@mander.xyz 30 points 7 months ago

First of all, congratulations on bringing a baby girl into this world!! You must be really excited! I am very happy for you!

This looks very cool. I set up a wiki (https://ibis.mander.xyz/) and I will make an effort to populate it with some Lemmy lore and interesting science/tech 😄 Hopefully I can set some time aside and help with a tiny bit of code too.

[-] Sal@mander.xyz 30 points 7 months ago

Search engines like Google aggregate data from multiple sites. I may want to download a datasheet for an electronic component, find an answer to a technical question, find a language learning course site, or look for museums in my area.

Usually I make specific searches with very specific conditions, so I tend to get few and relevant results. I think search engines have their place.

[-] Sal@mander.xyz 17 points 8 months ago

Is the fact that I recognize this comment evidence that I use Lemmy a bit too much? 😅

[-] Sal@mander.xyz 15 points 9 months ago

You can set up a personalized RSS feed with Feeder. It will take a bit of effort to set up, but you can create a feed that is very well tailored to your interests. You can get news feeds but also subscribe to other kinds of content, like scientific publications and financial statements.

34
submitted 9 months ago by Sal@mander.xyz to c/collapse@sopuli.xyz

[-] Sal@mander.xyz 12 points 9 months ago

A botnet could have many unique accounts, and some could even look like real users. So I can't rule it out. I also haven't done a deeper dive into the accounts.

But when a post gets popular I would expect it to get at least a few downvotes, regardless of what it is.

[-] Sal@mander.xyz 28 points 9 months ago

There is one account that has a single comment from 5 months ago and that is downvoting most posts and comments. That one is very suspicious.

Other than that... No other accounts are as obvious. A few do recur, but most of those votes seem organic on first inspection.

[-] Sal@mander.xyz 24 points 9 months ago* (last edited 9 months ago)

Yes, you are right. If a mod wants, I can send them the username so they can ban the account from the community. As an admin I can see it from my instance, but I can't take action.

[-] Sal@mander.xyz 22 points 9 months ago* (last edited 9 months ago)

I can tell you one benefit: Money. Most of my server's costs come from storing federated content. Federating with Threads would likely be expensive.

[-] Sal@mander.xyz 43 points 1 year ago* (last edited 1 year ago)

  • Password hashing occurs server-side. Even without removing the hashing step, an admin can intercept the plaintext password during login. Use unique, strong passwords.

  • An admin can intercept the JWT authentication cookie and use any account that lives on the instance.

  • Private messages are stored as plaintext in the database.

  • Admins can see who upvotes/downvotes what

  • These are not things that are unique to Lemmy. This is common.

  • To avoid having to trust your admin, run an instance.
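The first two points can be sketched in a few lines. This is not Lemmy's code: the function names are hypothetical, PBKDF2 from the Python standard library stands in for whatever hashing algorithm the server really uses, and the token is a hand-rolled HS256 JWT-style string. It only illustrates that the server handles the plaintext password before hashing it, and that whoever holds a valid token is treated as the account owner.

```python
import base64
import hashlib
import hmac
import json
import os

seen_by_server = []  # stands in for anything an admin could log or inspect


def hash_password(plaintext: str, salt: bytes) -> bytes:
    # PBKDF2 is an assumption chosen because it ships with Python.
    return hashlib.pbkdf2_hmac("sha256", plaintext.encode(), salt, 100_000)


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_token(claims: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{b64url(mac.digest())}"


def verify_token(token: str, secret: bytes) -> dict:
    header, payload, signature = token.split(".")
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    if not hmac.compare_digest(signature, b64url(mac.digest())):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def login(username, plaintext, stored_salt, stored_hash, secret):
    # The plaintext reaches server code before it is hashed.
    seen_by_server.append(plaintext)
    candidate = hash_password(plaintext, stored_salt)
    if not hmac.compare_digest(candidate, stored_hash):
        raise PermissionError("wrong password")
    return sign_token({"sub": username}, secret)


# Hypothetical account, created once at signup.
secret = os.urandom(32)
salt = os.urandom(16)
stored = hash_password("correct horse", salt)

token = login("alice", "correct horse", salt, stored, secret)
# The token is a bearer credential: anyone holding it verifies as alice.
print(verify_token(token, secret)["sub"])
print(seen_by_server)
```

Hashing protects the stored password if the database leaks, but it does not hide the password from the server that performs the hashing. That is why a unique password per instance limits the damage.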

[-] Sal@mander.xyz 31 points 1 year ago* (last edited 1 year ago)

Full genome sequencing.

The price of sequencing continues to decrease as the technology evolves. I have already seen claims of under $1,000 for a full human genome. I haven't looked carefully into those claims, but I think we are around there. In some years full genomes will be so cheap to sequence that it will be routine. I want to buy one of those small Oxford Nanopore MinION sequencers in the future. I'll use it like a pokedex.

Sal

joined 2 years ago