[Sugar-devel] Draft Proposal for Add AI to Chat activity

Chihurumnaya Ibiam ibiam at sugarlabs.org
Fri Mar 29 10:39:50 EDT 2024


Your timeline is terse; it would be great if you added some implementation
detail to your proposal, as there's scarcely any.

-- 

Ibiam Chihurumnaya
ibiam at sugarlabs.org



On Thu, Mar 28, 2024 at 7:54 PM Sujay R <sujay1844 at gmail.com> wrote:

> Ok, the cloud sounds good.
>
> And thank you for patiently discussing the requirements with me and
> answering my questions. Now, I think I can come up with a tentative
> timeline.
> I've attached v2 of my proposal with the timeline. Please let me know if
> there are any more changes. If none, I'll submit it on the GSoC website.
> The deadline is right around the corner.
>
> On Wed, Mar 27, 2024 at 11:37 PM Chihurumnaya Ibiam <ibiam at sugarlabs.org>
> wrote:
>
>> We were thinking of a cloud option, but we haven't decided yet.
>>
>> --
>>
>> Ibiam Chihurumnaya
>> ibiam at sugarlabs.org
>>
>>
>>
>> On Tue, Mar 26, 2024 at 6:40 PM Sujay R <sujay1844 at gmail.com> wrote:
>>
>>> Thanks for explaining that. But it's still not clear to me where the
>>> FOSS LLM should be run. Not on the devices running Sugar, so is cloud the
>>> option you're looking for?
>>>
>>> On Tue, Mar 26, 2024 at 11:02 PM Chihurumnaya Ibiam <ibiam at sugarlabs.org>
>>> wrote:
>>>
>>>> The plan was never really to run an LLM within Sugar, since that would
>>>> drastically increase the size of the activity and of Sugar itself, as
>>>> Chat is a fructose
>>>> <https://wiki.sugarlabs.org/go/Development_Team/Release/Modules>
>>>> activity.
>>>>
>>>> --
>>>>
>>>> Ibiam Chihurumnaya
>>>> ibiam at sugarlabs.org
>>>>
>>>>
>>>>
>>>> On Tue, Mar 26, 2024 at 10:54 AM Sujay R <sujay1844 at gmail.com> wrote:
>>>>
>>>>> Sugar runs on a lot of devices including low end devices - 2GB ram -
>>>>>> and we intend to keep it that way, the chat activity is
>>>>>> typically used by more than one Sugar instance, the chatbot should
>>>>>> also be able to run on just one instance.
>>>>>
>>>>>
>>>>> Running LLMs on just 2GB of RAM is at least a few years away, so the
>>>>> bot has to be served behind an API. Cloud is a good option; there is
>>>>> serverless GPU inference as well as provisioned instances. One
>>>>> provider I like is RunPod (serverless pricing
>>>>> <https://www.runpod.io/serverless-gpu> and provisioned pricing
>>>>> <https://www.runpod.io/gpu-instance/pricing>). Local hosting is also
>>>>> an option: for a 7B model, a moderately recent (4-5 year old) GPU
>>>>> with 16GB of VRAM is enough. Running with less memory is possible,
>>>>> but only with aggressive quantisation (reducing weight precision),
>>>>> at the cost of quality and speed.
>>>>>
>>>>>
>>>>>> You can leverage the sugar-datastore if you need to store activity
>>>>>> related data.
>>>>>>
>>>>>
>>>>> Storing the chat history is not an issue. The time complexity of
>>>>> generation (inference) for a transformer is O(n^2), where n is the
>>>>> number of tokens, so we need to be mindful of how much history we
>>>>> actually include in each request.


More information about the Sugar-devel mailing list